Continued from the old repo by antinucleon.
Unlike https://github.com/coursera-dl/edx-dl, this codebase is under 300 lines, so it takes less time to tinker with.
It also lets you cheat visually, which makes it more adaptable to future changes (Russia ban). I mean, Chrome is magic: imagine right-clicking an element to copy its XPath, rather than scratching your head debugging your way through a maze that might fail anywhere, anytime (of course).
I spent a few hours getting it working.
- Downloads course BEM1105x: videos, slides, quizzes. The snapshot is not perfect (cut off).
- A separate folder for each Unit
- Full automation
Linux, Ubuntu (see below). Python virtual env (optional).
- Edit the Homebrew location of aria2c on Linux (default: /home/linuxbrew/.linuxbrew/bin/aria2c). The path is spelled out because PyCharm can't find PATH. If you are on Mac, or aria2c is already in PATH, change it accordingly.
- Download chromedriver and put it into ./bin
- Install aria2
- Run pip install -r requirements.txt
- Rename settings.sample.yaml to settings.yaml
- Edit settings.yaml to hold your credentials. Don't share your password.
- Run python edx-dl.py, then input your user name, password, and course URL (e.g. https://learning.edx.org/course/course-v1:CaltechX+BEM1105x+3T2020/home)
- Wait a while for Chrome to open up
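
For the aria2c path step above, here is a minimal sketch of how the lookup could be made less manual. It is not taken from edx-dl.py: `resolve_aria2c` and `DEFAULT_ARIA2C` are hypothetical names, and the only value from this README is the Linuxbrew default path. The idea is to try an explicit path first, then PATH, then the Linuxbrew default.

```python
import os
import shutil

# Linuxbrew default mentioned in the setup steps above.
DEFAULT_ARIA2C = "/home/linuxbrew/.linuxbrew/bin/aria2c"


def resolve_aria2c(configured=None, default=DEFAULT_ARIA2C):
    """Return a path to aria2c, or None if no candidate exists.

    Hypothetical helper, not part of the repo. Candidates are tried in
    order: an explicitly configured path, whatever is on PATH, then the
    Linuxbrew default.
    """
    candidates = [configured, shutil.which("aria2c"), default]
    for path in candidates:
        # Skip empty candidates and paths that do not point at a file.
        if path and os.path.isfile(path):
            return path
    return None


if __name__ == "__main__":
    print(resolve_aria2c() or "aria2c not found; install it or set the path by hand")
```

If an IDE such as PyCharm starts Python with a trimmed PATH, the second candidate simply comes back empty and the default is used instead.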
- Headless mode may be buggy because the script does not wait for some pages to finish loading. However, I have been too lazy to fix it.
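
The missing waits could be handled with an explicit polling helper along these lines. This is only a sketch, not code from edx-dl.py: `wait_until` is a hypothetical name, and the usage comment assumes the script drives Chrome through Selenium (this README only mentions chromedriver). If Selenium is in use, its own `WebDriverWait` does the same job.

```python
import time


def wait_until(condition, timeout=30.0, poll=0.5):
    """Call condition() repeatedly until it returns a truthy value.

    Returns that value, or raises TimeoutError once `timeout` seconds
    have passed without a truthy result.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        time.sleep(poll)


# Hypothetical usage with a Selenium driver (assumption, not from the repo):
#   videos = wait_until(lambda: driver.find_elements("xpath", "//video"))
# An empty list is falsy, so the loop keeps polling until an element shows up.
```

Polling for a concrete element tends to be more reliable in headless mode than a fixed sleep, since page load times vary.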
kaggle master: https://bingxu.io/20200724/about.html
Well, now we use the latest API.