Open
Description
We are about to deploy the initial version of the API, which currently provides data for the awesome-manim feed on the website (corresponding website PR: ManimCommunity/manim-website#73).
The scraper currently:
- fetches the list of all YouTube channel links from the README file
- scrapes the publicly available RSS feeds at `https://www.youtube.com/feeds/videos.xml?channel_id=xxxxxxx`
- searches for the substrings `Manim` (case insensitive), `#some` (case insensitive), and `SoME` (case sensitive) in the video title or description. Matching videos are marked as "being manim videos".
- stores them in a MySQL database and serves them chronologically on a paginated endpoint `/videos/n`, 30 videos at a time.
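For anyone who wants to reason about the matching and pagination rules, here is a minimal Python sketch of how they could look. It is not the deployed scraper code: the `Video` type and the function names `is_manim_video` and `get_page` are hypothetical, the list is kept in memory instead of MySQL, and it assumes that "chronologically" means newest first and that `n` in `/videos/n` starts at 1.

```python
from datetime import datetime
from typing import NamedTuple

PAGE_SIZE = 30  # the endpoint serves 30 videos at a time


class Video(NamedTuple):
    # Hypothetical record type; the real database schema may differ.
    title: str
    description: str
    published: datetime


def is_manim_video(title: str, description: str) -> bool:
    """Apply the three substring rules to the title and description."""
    text = f"{title}\n{description}"
    lowered = text.lower()
    return (
        "manim" in lowered      # "Manim", case insensitive
        or "#some" in lowered   # "#some", case insensitive
        or "SoME" in text       # "SoME", case sensitive
    )


def get_page(videos: list[Video], n: int, page_size: int = PAGE_SIZE) -> list[Video]:
    """Return page n of the matching videos (assumed 1-based, newest first)."""
    matching = [v for v in videos if is_manim_video(v.title, v.description)]
    matching.sort(key=lambda v: v.published, reverse=True)
    start = (n - 1) * page_size
    return matching[start:start + page_size]
```

Note that the case-sensitive `SoME` rule does not fire on ordinary words such as "awesome", while the case-insensitive `Manim` rule matches anywhere in the text, including inside longer words.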
This issue records some ideas we could implement in the future based on feedback.
- A deeper scrape of all the channels (RSS feeds just return the latest 15 videos)
- An algorithmic feed that prioritizes videos with higher engagement, but still retains the chronological ordering to some degree
- ...
Feel free to discuss these and propose any other ideas.
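As a starting point for discussing the engagement-aware feed idea, here is one possible scoring rule sketched in Python. Everything in it is an assumption for illustration: the `FeedItem` type, the use of view counts as the engagement signal, the log-scaled score, and the `decay_days` parameter are all hypothetical and not part of the current API.

```python
import math
from datetime import datetime, timezone
from typing import NamedTuple


class FeedItem(NamedTuple):
    # Hypothetical record; view counts are assumed to be available.
    title: str
    published: datetime
    views: int


def feed_score(item: FeedItem, now: datetime, decay_days: float = 7.0) -> float:
    """Log-scaled engagement minus a linear age penalty.

    With decay_days=7, every week of age cancels out one order of
    magnitude in views, so recency still shapes the ordering.
    """
    age_days = max((now - item.published).total_seconds() / 86400, 0.0)
    return math.log10(1 + item.views) - age_days / decay_days


def rank_feed(items: list[FeedItem], now: datetime, decay_days: float = 7.0) -> list[FeedItem]:
    """Sort items by descending score."""
    return sorted(items, key=lambda it: feed_score(it, now, decay_days), reverse=True)


if __name__ == "__main__":
    now = datetime(2024, 1, 31, tzinfo=timezone.utc)
    items = [
        FeedItem("fresh, few views", datetime(2024, 1, 30, tzinfo=timezone.utc), 100),
        FeedItem("week old, popular", datetime(2024, 1, 24, tzinfo=timezone.utc), 100_000),
        FeedItem("month old, popular", datetime(2024, 1, 1, tzinfo=timezone.utc), 100_000),
    ]
    for item in rank_feed(items, now):
        print(item.title)
```

A smaller `decay_days` pushes the feed closer to pure chronological order, and a larger one lets popular older videos stay near the top for longer.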