[BUG] LangChain chain.stream results are not shown on the TruLens dashboard #1357
Hey, @zongyicoding! I'm here to assist you with any bugs, questions, or contributions. Let me know how I can help you with the LangChain issue. To correctly wait for the LLM output to complete when using chain.stream, run the chain as an asyncio task, consume the tokens through an AsyncIteratorCallbackHandler, and await the task before reading the record from the recording context.
Here's an updated version of your code to achieve this:
from fastapi import FastAPI, Request
from fastapi.responses import StreamingResponse, FileResponse
from langchain_openai import ChatOpenAI
from langchain.schema import Document
from langchain_openai import OpenAIEmbeddings
from langchain_community.vectorstores.chroma import Chroma
from langchain_core.runnables import RunnablePassthrough
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser
from trulens_eval import TruChain, Feedback
from trulens_eval.feedback.provider import OpenAI
from trulens_eval import Tru
from langchain.callbacks import AsyncIteratorCallbackHandler
import asyncio
tru = Tru()
tru.run_dashboard(port=9999, force=True)
embedding_function = OpenAIEmbeddings(
model="text-embedding-ada-002",
check_embedding_ctx_length=False
)
docs = [
Document(
page_content="the dog loves to eat pizza", metadata={"source": "animal.txt"}
),
Document(
page_content="the cat loves to eat lasagna", metadata={"source": "animal.txt"}
),
]
db = Chroma.from_documents(docs, embedding_function)
retriever = db.as_retriever()
template = """Answer the question based only on the following context:
{context}
Question: {question}
"""
prompt = ChatPromptTemplate.from_template(template)
model = ChatOpenAI(temperature=0, streaming=True, model='gpt-3.5-turbo')
provider = OpenAI()
retrieval_chain = (
{"context": retriever, "question": RunnablePassthrough()}
| prompt
| model
| StrOutputParser()
)
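# Note (editor's comment): in this LCEL composition, the incoming question is
# routed both into the retriever (filling {context}) and passed through
# unchanged as {question}; the prompt, model, and output parser then run in
# sequence, so the chain is invoked with the raw question string.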
f_answer_relevance = (
Feedback(provider.context_relevance_with_cot_reasons, name="Answer Relevance")
.on_input_output()
)
tru_recorder = TruChain(
retrieval_chain,
app_id='conversation_stream',
feedbacks=[f_answer_relevance],
feedback_mode="deferred"
)
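# Note (editor's comment): with feedback_mode="deferred", feedback functions
# are not evaluated inline; a separate evaluator process must pick them up,
# or records appear on the dashboard without scores. A minimal sketch, using
# trulens_eval's deferred-evaluation entry point:
# tru.start_evaluator()  # processes deferred feedback in the background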
app = FastAPI()
async def generate_chat_responses(message):
    callback = AsyncIteratorCallbackHandler()
    with tru_recorder as recording:
        # Run the chain in the background; the LCEL runnable exposes ainvoke,
        # and callbacks are passed through the config argument.
        task = asyncio.create_task(
            retrieval_chain.ainvoke(
                message,
                config={"callbacks": [callback]}
            )
        )
        response = ""
        async for token in callback.aiter():
            content = token.replace("\n", "<br>")
            print(f'content: {content}')
            response += content
            yield f"data: {content}\n\n"
        # Await the task only after the token stream is exhausted, so any
        # exception surfaces and TruChain can finalize the record while the
        # recording context is still open.
        await task
        record = recording.get()
        print(f'Record: {record}')
@app.get("/")
async def root():
return FileResponse("static/index.html")
@app.get("/chat_stream/{message}")
async def chat_stream(message: str):
return StreamingResponse(generate_chat_responses(message=message), media_type="text/event-stream")
if __name__ == "__main__":
import uvicorn
uvicorn.run(app, host="127.0.0.1", port=8000) Key Changes:
This approach ensures that the streaming outputs are correctly captured and recorded in the TruLens dashboard, providing the correct score after the LLM generation is completed [1][2].
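If the record still does not show up, one way to confirm whether it reached the database at all is to query it directly. A minimal sketch, assuming trulens_eval's get_records_and_feedback API and the app_id used above:
# Inspect stored records outside the dashboard (assumes the
# 'conversation_stream' app_id from the example above).
records_df, feedback_cols = tru.get_records_and_feedback(app_ids=["conversation_stream"])
print(records_df[["app_id", "input", "output"]].tail())
print(feedback_cols)  # names of feedback columns, e.g. "Answer Relevance"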
hi @dosu.
In my experimental environment, I need to use an internal api_key and base_url. Is there any way to solve this problem?
To resolve the issues with using an internal api_key and base_url, pass them explicitly to every component that talks to the API: the ChatOpenAI model, the OpenAIEmbeddings instance, and the TruLens OpenAI feedback provider all accept these as constructor arguments.
By following these steps, you should be able to resolve the issues with using the internal api_key and base_url; a sketch of the wiring follows.
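A minimal sketch, assuming the langchain_openai and trulens_eval constructor arguments current at the time of this thread (API_KEY and BASE_URL are placeholder environment variables):
import os
from langchain_openai import ChatOpenAI, OpenAIEmbeddings
from trulens_eval.feedback.provider import OpenAI

# Point every OpenAI-compatible client at the internal endpoint.
api_key = os.environ["API_KEY"]    # placeholder names for illustration
base_url = os.environ["BASE_URL"]

model = ChatOpenAI(model="gpt-3.5-turbo", api_key=api_key, base_url=base_url)
embeddings = OpenAIEmbeddings(model="text-embedding-ada-002",
                              api_key=api_key, base_url=base_url)
# The trulens_eval OpenAI provider forwards client kwargs to the underlying
# OpenAI client; base_url support here is an assumption worth verifying.
provider = OpenAI(api_key=api_key, base_url=base_url)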
Hi @sfc-gh-jreini,
I applied the suggested changes, but now the answers are no longer accessible from the stream loop, and the TruLens dashboard stores multiple records without the actual answer. Here is my current code:
from fastapi import FastAPI, Request
from fastapi.responses import StreamingResponse, FileResponse
from langchain_openai import ChatOpenAI
from langchain.schema import Document
from langchain_openai import OpenAIEmbeddings
from langchain_community.vectorstores.chroma import Chroma
from langchain_core.runnables import RunnablePassthrough
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser
from trulens_eval import TruChain, Feedback
from trulens_eval.feedback.provider import OpenAI
from trulens_eval import Tru
from langchain.callbacks import AsyncIteratorCallbackHandler
import asyncio
import os
from dotenv import load_dotenv
load_dotenv()
tru = Tru()
tru.reset_database()
tru.run_dashboard(port=9999, force=True)
embedding_function = OpenAIEmbeddings(
model="text-embedding-ada-002",
check_embedding_ctx_length=False,
api_key=os.environ['API_KEY']
)
docs = [
Document(
page_content="the dog loves to eat pizza", metadata={"source": "animal.txt"}
),
Document(
page_content="the cat loves to eat lasagna", metadata={"source": "animal.txt"}
),
]
db = Chroma.from_documents(docs, embedding_function)
retriever = db.as_retriever()
template = """Answer the question based only on the following context:
{context}
Question: {question}
"""
prompt = ChatPromptTemplate.from_template(template)
model = ChatOpenAI(temperature=0, streaming=True, api_key=os.environ['API_KEY'],model='gpt-3.5')
provider = OpenAI(api_key=os.environ['API_KEY'])
retrieval_chain = (
{"context": retriever, "question": RunnablePassthrough()}
| prompt
| model
| StrOutputParser()
)
f_answer_relevance = (
Feedback(provider.context_relevance_with_cot_reasons, name="Answer Relevance")
.on_input_output()
)
tru_recorder = TruChain(
retrieval_chain,
app_id='conversation_stream',
feedbacks=[f_answer_relevance],
feedback_mode="deferred"
)
app = FastAPI()
async def generate_chat_responses(message):
    callback = AsyncIteratorCallbackHandler()
    with tru_recorder as recording:
        task = asyncio.create_task(
            retrieval_chain.ainvoke(
                input=dict(question=message),
                callbacks=[callback]
            )
        )
        response = ""
        async for token in callback.aiter():
            content = token.replace("\n", "<br>")
            print(f'content: {content}')
            response += content
            yield f"data: {content}\n\n"
        await task
        record = recording.get()
        print(f'Record: {record}')
@app.get("/")
async def root():
return FileResponse("static/index.html")
@app.get("/chat_stream/{message}")
async def chat_stream(message: str):
return StreamingResponse(generate_chat_responses(message=message), media_type="text/event-stream")
if __name__ == "__main__":
import uvicorn
uvicorn.run(app, host="127.0.0.1", port=8080)
Please assist in addressing these issues. Thank you.
To resolve the issues where the answers are no longer accessible from the stream loop and the TruLens dashboard is storing multiple records without the actual answer, you should ensure that you are using the correct version of the trulens_eval package. The release notes for version 0.31.0 mention support for LangChain >=0.2.x retrievers via instrumenting the invoke method:
## 0.31.0
### What's Changed
* Support for langchain >=0.2.x retrievers via instrumenting the `invoke` method by @nicoloboschi in https://github.com/truera/trulens/pull/1187
Additionally, you can refer to the example implementation provided in the TruBot example. Here is a relevant snippet from the example:
import os
os.environ['PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION'] = 'python'
from langchain.chains import ConversationalRetrievalChain
from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.memory import ConversationSummaryBufferMemory
from langchain_community.callbacks import get_openai_callback
from langchain_community.llms import OpenAI
from langchain_community.vectorstores import Pinecone
import numpy as np
import pinecone
import streamlit as st
from trulens_eval import feedback
from trulens_eval import Select
from trulens_eval import tru
from trulens_eval import tru_chain_recorder
from trulens_eval.feedback import Feedback
from trulens_eval.keys import check_keys
check_keys("PINECONE_API_KEY", "PINECONE_ENV", "OPENAI_API_KEY")
# Set up GPT-3 model
model_name = "gpt-3.5-turbo"
app_id = "TruBot"
# Pinecone configuration.
pinecone.init(
api_key=os.environ.get("PINECONE_API_KEY"), # find at app.pinecone.io
environment=os.environ.get("PINECONE_ENV") # next to api key in console
)
identity = lambda h: h
hugs = feedback.Huggingface()
openai = feedback.OpenAI()
f_lang_match = Feedback(hugs.language_match).on(
text1=Select.RecordInput, text2=Select.RecordOutput
)
f_qa_relevance = Feedback(openai.relevance).on(
prompt=Select.RecordInput, response=Select.RecordOutput
)
f_qs_relevance = Feedback(openai.qs_relevance).on(
question=Select.RecordInput,
statement=Select.Record.chain.combine_docs_chain._call.args.inputs.
input_documents[:].page_content
).aggregate(np.min)
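# Note (editor's comment): the Select lens above reaches into the captured
# record to pull each retrieved document's page_content from the
# combine_docs_chain call arguments; the per-statement scores are then
# aggregated with np.min, so the weakest supporting context drives the score.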
def generate_response(prompt):
    # Embedding needed for Pinecone vector db.
    embedding = OpenAIEmbeddings(model='text-embedding-ada-002')  # 1536 dims
    docsearch = Pinecone.from_existing_index(
        index_name="llmdemo", embedding=embedding
    )
    retriever = docsearch.as_retriever()

    # LLM for completing prompts, and other tasks.
    llm = OpenAI(temperature=0, max_tokens=128)

    # Conversation memory.
    memory = ConversationSummaryBufferMemory(
        max_token_limit=650,
        llm=llm,
        memory_key="chat_history",
        output_key='answer'
    )

    # Conversational chain puts it all together.
    chain = ConversationalRetrievalChain.from_llm(
        llm=llm,
        retriever=retriever,
        return_source_documents=True,
        memory=memory,
        get_chat_history=identity,
        max_tokens_limit=4096
    )

    # Language mismatch fix:
    if "langprompt" in app_id:
        chain.combine_docs_chain.llm_chain.prompt.template = \
            "Use the following pieces of CONTEXT to answer the question at the end " \
            "in the same language as the question. If you don't know the answer, " \
            "just say that you don't know, don't try to make up an answer.\n" \
            "\n" \
            "CONTEXT: {context}\n" \
            "\n" \
            "Question: {question}\n" \
            "Helpful Answer: "
    elif "relevance" in app_id:
        # Contexts fix
        chain.combine_docs_chain.llm_chain.prompt.template = \
            "Use only the relevant contexts to answer the question at the end " \
            ". Some pieces of context may not be relevant. If you don't know the answer, " \
            "just say that you don't know, don't try to make up an answer.\n" \
            "\n" \
            "Contexts: \n" \
            "{context}\n" \
            "\n" \
            "Question: {question}\n" \
            "Helpful Answer: "
        # space is important
        chain.combine_docs_chain.document_prompt.template = "\tContext: {page_content}"

    # Trulens instrumentation.
    tc = tru_chain_recorder.TruChain(chain, app_id=app_id)
    return tc, tc.with_record(dict(question=prompt))
# Set up Streamlit app
st.title("TruBot")
user_input = st.text_input("Ask a question about TruEra")
if user_input:
    # Generate GPT-3 response
    prompt_input = user_input
    # add context manager to capture tokens and cost of the chain
    with get_openai_callback() as cb:
        chain, (response, record) = generate_response(prompt_input)
        total_tokens = cb.total_tokens
        total_cost = cb.total_cost
    answer = response['answer']

    # Display response
    st.write(answer)

    record_id = tru.add_data(
        app_id=app_id,
        prompt=prompt_input,
        response=answer,
        record=record,
        tags='dev',
        total_tokens=total_tokens,
        total_cost=total_cost
    )

    # Run feedback function and get value
    feedbacks = tru.run_feedback_functions(
        app=app,
        record=record,
        feedback_functions=[f_lang_match, f_qa_relevance, f_qs_relevance]
    )

    # Add value to database
    tru.add_feedback(record_id, feedbacks)
Ensure that your implementation aligns with the updated method signatures and functionalities provided in this version. If you are still encountering issues, it might be beneficial to review the specific changes made in the pull request mentioned above (#1187) to understand the exact modifications and how they should be integrated into your code. Additionally, you can refer to the LangChain agents example. Here is a relevant snippet:
from trulens_eval import Feedback
from trulens_eval import Tru
from trulens_eval import TruChain
from trulens_eval.feedback import OpenAI as fOpenAI
tru = Tru()
from datetime import datetime
from datetime import timedelta
from typing import Type
from langchain import SerpAPIWrapper
from langchain.agents import AgentType
from langchain.agents import initialize_agent
from langchain.agents import Tool
from langchain.chat_models import ChatOpenAI
from langchain.tools import BaseTool
import openai
from pydantic import BaseModel
from pydantic import Field
import yfinance as yf
# Setup API keys
import os
os.environ["OPENAI_API_KEY"] = "sk-..."
os.environ["SERPAPI_API_KEY"] = "..."
# Create agent with search tool
search = SerpAPIWrapper()
search_tool = Tool(
name="Search",
func=search.run,
description="useful for when you need to answer questions about current events"
)
llm = ChatOpenAI(model="gpt-3.5-turbo", temperature=0)
tools = [search_tool]
agent = initialize_agent(
tools, llm,
agent=AgentType.OPENAI_FUNCTIONS,
verbose=True
)
# Set up Evaluation
class OpenAI_custom(fOpenAI):
    def no_answer_feedback(self, question: str, response: str) -> float:
        return float(self.endpoint.client.chat.completions.create(
            model="gpt-3.5-turbo",
            messages=[
                {"role": "system", "content": "Does the RESPONSE provide an answer to the QUESTION? Rate on a scale of 1 to 10. Respond with the number only."},
                {"role": "user", "content": f"QUESTION: {question}; RESPONSE: {response}"}
            ]
        ).choices[0].message.content) / 10
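# Note (editor's comment): the rating is requested on a 1-10 scale and divided
# by 10 so the score lands in (0, 1]; float() will raise if the model returns
# anything other than a bare number, so production code may want a fallback.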
custom = OpenAI_custom()
f_no_answer = Feedback(custom.no_answer_feedback).on_input_output()
tru_agent = TruChain(
agent,
app_id="Search_Agent",
feedbacks=[f_no_answer]
)
prompts = [
"What company acquired MosaicML?",
"What's the best way to travel from NYC to LA?",
"How did the change in the exchange rate during 2021 affect the stock price of US based companies?",
"Compare the stock performance of Google and Microsoft",
"What is the highest market cap airline that flies from Los Angeles to New York City?",
"I'm interested in buying a new smartphone from the producer with the highest stock price. Which company produces the smartphone I should by and what is their current stock price?"
]
with tru_agent as recording:
    for prompt in prompts:
        agent(prompt)

# Explore in a Dashboard
tru.run_dashboard()  # open a local streamlit app to explore
# tru.stop_dashboard()  # stop if needed
This example demonstrates how to set up a LangChain agent, add custom tools, and use TruLens for evaluation and feedback. It might provide additional insights into resolving your issues with the streaming chain.
This is a tough nut to crack, could really use your help on this, @joshreini1.
We have fixes coming for async and streaming in this PR. Feel free to install it from the branch if you'd like to try it now; otherwise we'll have it in the next release.
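For reference, installing from a PR branch usually looks something like the line below; the branch name is a placeholder (the PR itself isn't linked in this excerpt), and the subdirectory layout assumes the truera/trulens repo structure at the time of this thread:
pip install "git+https://github.com/truera/trulens.git@<pr-branch>#subdirectory=trulens_eval"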
Bug Description
I am currently using LangChain's chain.stream to stream results in chunks to the frontend web page. After the output ends, I cannot correctly capture the LLM's output and scores (there is no data at all on the dashboard). How can I correctly wait for the LLM output to complete, record the complete output in the TruLens dashboard, and get the correct score?
To Reproduce
Expected behavior
The following two behaviors are both acceptable.
1. Wait for the streaming generation to complete and record the full output in the dashboard.
2. After the LLM generation is completed, provide the response to TruLens for analysis and save it to the dashboard.
Environment: