Podcast Scraping: Radiolab

Radiolab seems to build its page with JavaScript. That made it slightly more annoying to find the right download link, but otherwise it's straightforward.
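When a page is assembled by JavaScript, the audio URL often sits inside a script blob rather than in an ordinary link tag. Here is a rough sketch of one way to dig it out, assuming the .mp3 URL appears as a plain (possibly slash-escaped) string somewhere in the raw page text. The sample markup and the `find_mp3_links` name are made up for illustration and are not Radiolab's actual page structure.

```python
import re

# Hypothetical sample of what a JavaScript-built page might embed.
# This is NOT Radiolab's real markup, just a stand-in for testing the idea.
SAMPLE_HTML = r'''
<script>window.__DATA__ = {"audio": "https://example.com/media/episode-one.mp3?x=1"};</script>
<script>var other = "https:\/\/example.com\/media\/episode-two.mp3";</script>
'''

# Match http(s) URLs containing ".mp3", allowing JSON-style escaped slashes (\/).
MP3_URL = re.compile(r'https?:(?:\\?/){2}[^"\'\s<>]+?\.mp3[^"\'\s<>]*')


def find_mp3_links(html):
    """Return the unique .mp3 URLs found in the raw page text, in order of appearance."""
    seen = []
    for match in MP3_URL.findall(html):
        url = match.replace("\\/", "/")  # undo escaped slashes
        if url not in seen:
            seen.append(url)
    return seen


print(find_mp3_links(SAMPLE_HTML))
```

On a real page you would pass in `requests.get(page_url).text` instead of the sample string, and you may still need to poke around by hand if the link is built up at runtime.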


import time
import requests
from bs4 import BeautifulSoup
import os.path
def download(href, title, extension="mp3", dirname='.'):
    print(href, title, extension, dirname)
    filename = "%s.%s" % (title, extension)
    filename = filename.replace("/", "-")
    # todo, path management
    if not os.path.exists(dirname):
        print("making dir %s" % dirname)
        os.makedirs(dirname)
    local_filename = os.path.join(dirname, filename)
    r = requests.get(href, stream=True)
    with open(local_filename, 'wb') as f:
        for ...

read more...


Podcast Scraping: 99 Percent Invisible

Today in desert island podcast scraping:

99 Percent Invisible

import time
import requests
from bs4 import BeautifulSoup
import os.path
def download(href, title, extension="mp3", dirname='.'):
    filename = "%s.%s" % (title, extension)
    filename = filename.replace("/", "-")
    # todo, path management
    if not os.path.exists(dirname):
        print("making dir %s" % dirname)
        os.makedirs(dirname)
    local_filename = os.path.join(dirname, filename)
    r = requests.get(href, stream=True)
    with open(local_filename, 'wb') as f:
        for chunk in r.iter_content(chunk_size=1024):
            if chunk: # filter out keep-alive new chunks
                f.write(chunk)
    return local_filename
for i in range(1 ...
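The filename handling at the top of download could be pulled out into its own small function, which makes it easy to try on a few titles without downloading anything. This is only a sketch: `build_local_path` is a hypothetical name, and `os.makedirs(..., exist_ok=True)` is swapped in for the explicit `os.path.exists` check.

```python
import os


def build_local_path(title, extension="mp3", dirname="."):
    """Hypothetical helper: return the path download() would write to, creating dirname if needed."""
    filename = "%s.%s" % (title, extension)
    # A slash in an episode title would otherwise be read as a directory separator.
    filename = filename.replace("/", "-")
    # exist_ok=True means no error is raised if the directory is already there.
    os.makedirs(dirname, exist_ok=True)
    return os.path.join(dirname, filename)


print(build_local_path("Episode 1/2", "mp3", "."))
```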

read more...


Desert Island Two: Planet Money

Desert Island Media Prep Day Two: Planet Money.



# https://www.npr.org/sections/money/127413729/planet-money/archive?date=12-31-2017
# Check out the date at the end: the last day of the month is the archive page.
# Weird episode numbering, so I'm going to make folders so they stay in roughly date order.
import time
import requests
from bs4 import BeautifulSoup
import os.path
def download(href, title, extension="mp3", dirname='.'):
    filename = "%s.%s" % (title, extension)
    filename = filename.replace("/", "-")
    # todo, path management
    local_filename = os.path.join(dirname, filename)
    if not os.path.exists(dirname):
        print("making dir %s" % dirname)
        os.makedirs(dirname)
    local_filename = os.path ...
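The two ideas in the comments above (one archive page per month, keyed by the month's last day, and one folder per month so things sort by date) can be sketched like this. The base URL is copied from the comment; the zero-padded MM-DD-YYYY format is an assumption inferred from that single example, and `month_end_archives` is a made-up name.

```python
import calendar

# Base URL taken from the comment above. The date format is an assumption
# based on the one example (12-31-2017); single-digit months may differ.
ARCHIVE_URL = "https://www.npr.org/sections/money/127413729/planet-money/archive?date=%02d-%02d-%04d"


def month_end_archives(first_year, last_year):
    """Yield (folder_name, archive_url) for the last day of every month in the range."""
    for year in range(first_year, last_year + 1):
        for month in range(1, 13):
            # monthrange returns (weekday of the 1st, number of days in the month).
            last_day = calendar.monthrange(year, month)[1]
            folder = "%04d-%02d" % (year, month)  # zero-padded so folders sort by date
            yield folder, ARCHIVE_URL % (month, last_day, year)


for folder, url in month_end_archives(2017, 2017):
    print(folder, url)
```

Each folder name could then be passed as `dirname` to download so episodes land in roughly date order.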

read more...


Desert Island: The Dollop

I'm going on a trip. The internet will probably be pretty horrible, at least at first.


I've been asking everyone for their Desert Island Recommendations. One of mine is The Dollop. This is how you download every episode of The Dollop with a little bit of Python.


You have to install python3, requests, and beautifulsoup4. I won't help you with that, but there are a load of good resources.


import time
import requests
from bs4 import BeautifulSoup
def download(href, title, extension="mp3"):
    filename = "%s.%s" % (title, extension)
    filename = filename.replace("/", "-")
    # todo, path management
    local_filename = filename
    r = requests.get ...

read more...



Issac Kelly