***: bwn has joined #archiveteam-bs
schbirid2 has joined #archiveteam-bs
Pixi has quit IRC (Quit: Pixi)
schbirid has quit IRC (Read error: Operation timed out)
Pixi has joined #archiveteam-bs
godane has quit IRC (Remote host closed the connection)
godane has joined #archiveteam-bs
Coderjo has quit IRC (Remote host closed the connection)
M-WillBra is now known as WillBradl
jacketcha: so is batoto just going to die
***: godane has quit IRC (Read error: Operation timed out)
wbradley has joined #archiveteam-bs
qw3rty16 has joined #archiveteam-bs
wbradley is now known as zeeboots
WillBradl is now known as WillBra4
WillBra4 is now known as zyph
qw3rty15 has quit IRC (Read error: Operation timed out)
zyph is now known as zyphlar
zeeboots has left WeeChat 1.4
godane has joined #archiveteam-bs
godane: so my archivebox project is maybe in alpha/stable stage
i found out that the built-in wifi on the rpi3 would disconnect a lot if wireless power management
was on
so i added 'wireless-power off' to /etc/network/interfaces
it was working for about 15 minutes when i was loading tons of pages from kiwix
vs like 5 or 10 pages before disconnecting with power management on
***: Mateon1 has quit IRC (Read error: Connection reset by peer)
Mateon1 has joined #archiveteam-bs
icedice has joined #archiveteam-bs
icedice has quit IRC (Read error: Connection reset by peer)
octothorp has quit IRC (Remote host closed the connection)
jdude104 has quit IRC (Leaving)
jdude104 has joined #archiveteam-bs
jdude104 has quit IRC (Client Quit)
jdude104 has joined #archiveteam-bs
icedice has joined #archiveteam-bs
Kimmer has quit IRC (Leaving)
Ravenloft has quit IRC (Read error: Connection reset by peer)
jdude has joined #archiveteam-bs
jdude104 has quit IRC (Read error: Operation timed out)
icedice has quit IRC (Ping timeout: 245 seconds)
jdude has quit IRC (Leaving)
jdude104 has joined #archiveteam-bs
jdude104 has quit IRC (Client Quit)
octothorp has joined #archiveteam-bs
Kimmer has joined #archiveteam-bs
jschwart has joined #archiveteam-bs
Coderjo has joined #archiveteam-bs
BlueMaxim has quit IRC (Leaving)
JAA: jacketcha: Yes, #botato.
***: Smiley has joined #archiveteam-bs
SmileyG has quit IRC (Ping timeout: 260 seconds)
REiN^ has quit IRC (Remote host closed the connection)
odemg: SketchCow, claim the $100
https://twitter.com/_cryptome_/status/952168812505387008
https://splinternews.com/rogue-archivists-are-creating-a-copy-of-gawker-com-so-t-1793861301
godane, we're ripping pbs content, see https://i.imgur.com/qGRIO9R.png ... get in here https://discord.gg/RQpHMJP (did you already write something?) still, get in there <3
godane: charlie rose uses a custom script just for charlierose.com
*i use a custom script
***: K4k has quit IRC (Read error: Connection reset by peer)
JAA: godane: What are you grabbing exactly? I had to ignore the actual videos in the ArchiveBot job towards the end because my machine had a forced reboot due to the Meltdown bug.
I'm planning to resume that though. There are about 5400 videos left IIRC.
godane: right now i'm grabbing the 762 version of the videos
i was downloading a month's worth of videos and then uploading them
my panic grab of 762 version is just in case shit hits the fan
JAA: Ok, the URLs I ignored look like this: https://pfm1hycdn01-a.akamaihd.net/788/1HY788_003_xp.f4v
godane: cause it should be around 2.5 to 3.0tb
JAA: The ArchiveBot job grabbed some 6 TB and the remaining videos will be another 2-3 TB.
godane: those f4v files most of the time don't exist
i'm also doing something crazy and making a mp3 collection from the charlie rose videos
the mp3 collection will offer hoarders with low disk space a way to have some sort of archive of it
btw another series i have to go after later is called 'The Open Mind'
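A minimal sketch of the audio-only idea godane describes above, assuming ffmpeg with libmp3lame is installed and the videos are local .mp4 files. The helper names, the directory layout, and the quality setting are hypothetical and not taken from godane's actual script.

```python
import subprocess
from pathlib import Path


def build_ffmpeg_cmd(src, dst, quality=4):
    """Build an ffmpeg command that drops the video stream and encodes MP3.

    -n        never overwrite an existing output file
    -vn       ignore the video stream
    -q:a N    LAME VBR quality (lower is better); 4 is an assumed default
    """
    return [
        "ffmpeg", "-n",
        "-i", str(src),
        "-vn",
        "-codec:a", "libmp3lame",
        "-q:a", str(quality),
        str(dst),
    ]


def extract_all(video_dir, out_dir):
    """Convert every .mp4 in video_dir (assumed extension) to an .mp3 in out_dir."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    for src in sorted(Path(video_dir).glob("*.mp4")):
        dst = out / (src.stem + ".mp3")
        if dst.exists():
            continue  # already converted on an earlier run
        subprocess.run(build_ffmpeg_cmd(src, dst), check=True)
```

Calling extract_all("videos", "mp3") would leave the originals untouched and write one MP3 per episode.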
SketchCow: odemg: I'm running something to pull out the gawker stuff.
I'm sure we used archivebot for it, not anything else, right
odemg: godane, ohh I know re crose stuff you sent me the script, just wondering about pbs
SketchCow, sound :D
SketchCow, you should likely tweet at them and let them know, get that money son!
JAA: godane: They do exist, but you can only access them if you set the correct referrer, otherwise you get the not found error.
***: mnjgno has joined #archiveteam-bs
mnjgno: hello! I did this: http://bookmarklets.htmlbin.net/archiving.html Do any of you know of more services? Obviously all of you use more advanced tools (warc, extensions) but for casual browsing, bookmarklets are excellent, so if any of you know about more services...? :D
SketchCow: The page should be a little pretty, and should have a way to preview what's IN the bookmarklet.
Kaz: Igloo: https://twitter.com/emilybatty/status/952241942963851266
Igloo: holy
Kaz: assuming hoax, lots of people reporting it but i feel like there'd be some coverage
Igloo: Wow
Pretty widespread
Kaz: https://twitter.com/NutzFordBucks/status/952243050675281922
mnjgno: @SketchCow, I am just gathering online archive services, so if you know more, :) obviously all can be improved.
SketchCow: That's fine
But I'm telling you "drag this bookmarklet to your bar" is the new "click on this awesome desktop toy.exe"
Document and make it easy to understand what these do
mnjgno: cool! I'll keep that in mind if I ever publish for more people. Although if I do that I should remove peeep.us then. thanks anyway :)
godane: JAA: what's the referer needed to get the f4v file
***: Uzerus has joined #archiveteam-bs
Uzerus: jacketcha: missile? where?
Kaz: BBC news dropping in with the *slowest* breaking news alert ever http://www.bbc.co.uk/news/world-us-canada-42677604
JAA: godane: Something like https://charlierose.com/video/player/24740?autoplay=false (for the URL above) I think. I'm not sure how strictly they check.
mnjgno: https://www.buzzfeed.com/mbvd/false-alarm-ballistic-missile-threat-hawaii
jacketcha: Uzerus: Hawaii
but, false alarm I guess
JAA: godane: Apparently a referrer of https://charlierose.com/ is sufficient.
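A minimal sketch of the referrer trick JAA describes, using only the Python standard library. It assumes the CDN checks nothing beyond the Referer header; build_request and fetch are hypothetical helper names, and the default referrer is the value quoted in the chat.

```python
import urllib.request

# Value mentioned in the chat; whether the CDN checks anything else is unknown.
DEFAULT_REFERER = "https://charlierose.com/"


def build_request(url, referer=DEFAULT_REFERER):
    """Return a Request for url that carries the given Referer header."""
    return urllib.request.Request(url, headers={"Referer": referer})


def fetch(url, referer=DEFAULT_REFERER, timeout=30):
    """Download url and return (status, body_bytes).

    urlopen raises urllib.error.HTTPError for 4xx/5xx responses, so a
    missing or rejected referrer would surface as an exception here.
    """
    with urllib.request.urlopen(build_request(url, referer), timeout=timeout) as resp:
        return resp.status, resp.read()
```

Checking the length of the returned body as well as the status is worthwhile, since a 200 response does not guarantee that any data came back.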
godane: tell me how to get this file: https://pfm1hycdn01-a.akamaihd.net/113/1HY113_007_lp.f4v
i can't get it to download even with charlierose.com as referer
***: Mateon1 has quit IRC (Read error: Operation timed out)
Mateon1 has joined #archiveteam-bs
JAA: godane: Hmm, yeah, neither can I. The server returns status 200 but an empty body.
The ArchiveBot job got the same result: 2017-12-02 22:57:21,338 - wpull.processor.web - INFO - Fetched ‘https://pfm1hycdn01-a.akamaihd.net/113/1HY113_007_lp.f4v’: 200 OK. Length: 0 [video/x-flv].
So I guess that file might be broken?
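A rough sketch for spotting other fetches like this one in a job log: it scans lines in the format of the wpull line quoted above and reports URLs that came back as 200 with a zero-length body. The regular expression is an assumption based on that single line, and empty_ok_fetches is a hypothetical helper; other wpull versions may log differently.

```python
import re

# Matches lines like:
#   ... - INFO - Fetched ‘URL’: 200 OK. Length: 0 [video/x-flv].
# Both curly and straight quotes around the URL are accepted.
FETCHED_RE = re.compile(
    r"Fetched [‘'](?P<url>[^’']+)[’']: (?P<status>\d{3}) .*?Length: (?P<length>\d+)"
)


def empty_ok_fetches(lines):
    """Yield URLs that were fetched with status 200 but a zero-length body."""
    for line in lines:
        match = FETCHED_RE.search(line)
        if match and match.group("status") == "200" and int(match.group("length")) == 0:
            yield match.group("url")
```

Feeding it the lines of a log file (for example, an open file object) would give a list of candidates to retry or to mark as broken at the source.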
godane: that episode is the only lost one i can't get
plus side is the 2 segments from that episode do exist
jrwr: Kaz: Igloo https://streamable.com/6fs0n
what was broadcast to TV for the EAS Alert
mnjgno: by the way, do any of you use peeep.us to bypass robots.txt files?
Kaz: Huh
No, we just ignore them
mnjgno: ah oki
Igloo: jrwr: holy cow that is hard to read
***: REiN^ has joined #archiveteam-bs
ranavalon has quit IRC (Quit: Leaving)
Jusque has quit IRC (Quit: ZNC - http://znc.in)
Jusque has joined #archiveteam-bs
Jusque has quit IRC (Client Quit)
Jusque has joined #archiveteam-bs
odemg has quit IRC (Ping timeout: 260 seconds)
mnjgno has quit IRC (Quit: Leaving)
odemg has joined #archiveteam-bs