This project uses LangChain, the Groq API, and supporting libraries to provide an intuitive interface for summarizing text content from YouTube videos, websites, and PDF files.
- YouTube Transcript Summarization: Extracts and summarizes the transcript of a YouTube video.
- Website Content Summarization: Fetches and summarizes text content from a given URL.
- PDF Summarization: Summarizes the content of uploaded PDF documents.
- Input Options:
  - Provide a YouTube video URL.
  - Enter a generic website URL.
  - Upload a PDF file.
- Processing:
  - For YouTube videos, it extracts the transcript using `YouTubeTranscriptApi`.
  - For websites, it fetches text content using `UnstructuredURLLoader`.
  - For PDFs, it processes the uploaded file using `PyPDFLoader`.
- Summarization:
  - A pre-defined prompt is used to generate a concise summary of the content using the `ChatGroq` LLM.
- Output:
  - The summarized text is displayed on the Streamlit app interface.
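
The input and processing steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual `app.py`: the helper names (`youtube_video_id`, `join_transcript`, `load_text`, and so on) are hypothetical, the loader imports assume the `langchain-community` package, and the `YouTubeTranscriptApi.get_transcript` call assumes an older release of `youtube-transcript-api` (newer releases changed this interface). Third-party imports are deferred into the functions that need them.

```python
import os
import tempfile
from typing import Optional
from urllib.parse import parse_qs, urlparse


def youtube_video_id(url: str) -> Optional[str]:
    """Return the video ID for common YouTube URL shapes, else None."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    if host == "youtu.be":
        return parsed.path.lstrip("/") or None
    if host in ("youtube.com", "www.youtube.com", "m.youtube.com"):
        if parsed.path == "/watch":
            return parse_qs(parsed.query).get("v", [None])[0]
    return None


def join_transcript(segments) -> str:
    """Join transcript segments (dicts with a 'text' key) into one string."""
    return " ".join(seg["text"].strip() for seg in segments)


def load_youtube_text(video_id: str) -> str:
    # Assumption: older youtube-transcript-api releases expose this static
    # method; check the installed version's documentation before relying on it.
    from youtube_transcript_api import YouTubeTranscriptApi

    return join_transcript(YouTubeTranscriptApi.get_transcript(video_id))


def load_website_text(url: str) -> str:
    from langchain_community.document_loaders import UnstructuredURLLoader

    docs = UnstructuredURLLoader(urls=[url]).load()
    return "\n\n".join(doc.page_content for doc in docs)


def load_pdf_text(pdf_bytes: bytes) -> str:
    from langchain_community.document_loaders import PyPDFLoader

    # PyPDFLoader reads from a file path, so the uploaded bytes are
    # written to a temporary file first and removed afterwards.
    with tempfile.NamedTemporaryFile(suffix=".pdf", delete=False) as tmp:
        tmp.write(pdf_bytes)
        path = tmp.name
    try:
        docs = PyPDFLoader(path).load()
    finally:
        os.remove(path)
    return "\n\n".join(doc.page_content for doc in docs)


def load_text(url: str = "", pdf_bytes: Optional[bytes] = None) -> str:
    """Route the input to the matching loader: PDF, YouTube, or website."""
    if pdf_bytes:
        return load_pdf_text(pdf_bytes)
    video_id = youtube_video_id(url)
    if video_id:
        return load_youtube_text(video_id)
    return load_website_text(url)
```

Giving an uploaded PDF priority over a URL is an arbitrary choice in this sketch; the real app may resolve that case differently.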
- Streamlit: For building the web interface.
- LangChain: For chaining and managing prompts.
- ChatGroq: LangChain's chat model integration for the Groq API, used to generate summaries.
- YouTubeTranscriptApi: For fetching YouTube video transcripts.
- PyPDFLoader: For reading and processing PDF files.
- UnstructuredURLLoader: For extracting content from websites.
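
As a rough illustration of how the prompt and `ChatGroq` fit together, the sketch below fills a summary prompt and sends it to the model. The prompt wording, the function names, and the model name are assumptions for illustration only, and plain string formatting stands in for a LangChain prompt template to keep the example short. It assumes the `langchain-groq` package is installed.

```python
# Hypothetical prompt text; the project's actual pre-defined prompt may differ.
SUMMARY_PROMPT = (
    "Provide a concise summary of the following content "
    "in about {max_words} words.\n\nContent:\n{text}"
)


def build_prompt(text: str, max_words: int = 300) -> str:
    """Fill the summary prompt with the loaded content."""
    return SUMMARY_PROMPT.format(text=text.strip(), max_words=max_words)


def summarize(text: str, groq_api_key: str,
              model: str = "llama-3.1-8b-instant") -> str:
    """Send the filled prompt to a Groq-hosted chat model and return its reply."""
    # Imported lazily so build_prompt can be used without langchain-groq.
    from langchain_groq import ChatGroq

    # The model name is a placeholder; Groq's list of available models
    # changes over time, so check which ones your account can use.
    llm = ChatGroq(model=model, api_key=groq_api_key)
    return llm.invoke(build_prompt(text)).content
```

Very long transcripts or PDFs may exceed the model's context window, so a real implementation would likely need to split or truncate the text before calling `summarize`.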
- Clone the repository:

      git clone https://github.com/your-repo-name.git
      cd your-repo-name

- Install the required dependencies:

      pip install -r requirements.txt

- Run the Streamlit app:

      streamlit run app.py

- Open the app in your browser and provide your Groq API key in the sidebar.
- Input a URL (YouTube/website) or upload a PDF file and click "Summarize the Content from YT, Website, or PDF."
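
The usage flow above could be wired up in Streamlit along these lines. This is only a sketch under stated assumptions: `validate_inputs` and `render_app` are hypothetical names, the widget labels other than the button text are placeholders, and `load_text` and `summarize` are callables supplied by the caller rather than functions known to exist in `app.py`.

```python
from typing import Callable, List, Optional


def validate_inputs(api_key: str, url: str, has_pdf: bool) -> List[str]:
    """Return user-facing error messages; an empty list means inputs are usable."""
    errors = []
    if not api_key.strip():
        errors.append("Please provide your Groq API key.")
    if not url.strip() and not has_pdf:
        errors.append("Please enter a URL or upload a PDF file.")
    elif url.strip() and not url.strip().lower().startswith(("http://", "https://")):
        errors.append("Please enter a valid URL starting with http:// or https://.")
    return errors


def render_app(
    load_text: Callable[[str, Optional[bytes]], str],
    summarize: Callable[[str, str], str],
) -> None:
    """Draw the page: API key in the sidebar, inputs and result in the main area."""
    # Imported lazily so validate_inputs can be tested without Streamlit.
    import streamlit as st

    # Placeholder labels; the real app's wording may differ.
    api_key = st.sidebar.text_input("Groq API Key", type="password")
    url = st.text_input("URL (YouTube or website)")
    pdf = st.file_uploader("Or upload a PDF", type=["pdf"])

    if st.button("Summarize the Content from YT, Website, or PDF"):
        errors = validate_inputs(api_key, url, pdf is not None)
        for message in errors:
            st.error(message)
        if not errors:
            with st.spinner("Summarizing..."):
                text = load_text(url, pdf.getvalue() if pdf else None)
                st.success(summarize(text, api_key))
```

Keeping validation in a plain function separate from the widgets makes it possible to unit-test the input rules without launching the app.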
The application is deployed on Hugging Face: LangChain Summarizer App
- `app.py`: Main application file.
- `requirements.txt`: List of required Python packages.
- Only works with YouTube videos that have transcripts enabled.
- Summarization quality depends on the provided content and LLM capabilities.
- Requires a valid Groq API key to function.
This project is licensed under the MIT License.