Stay Hungry,Stay Foolish!

Why the Pipe Character “|” Works in LangChain’s LCEL

Why the Pipe Character “|” Works in LangChain’s LCEL

http://cncc.bingj.com/cache.aspx?q=python+pipe+operator&d=4965480815663428&mkt=en-US&setlang=en-US&w=ZTsip_Llmj7SCg1Xnjy71UfpBFEYqgVM

Introduction

In LangChain, it is now recommended to describe Chains using the LangChain Expression Language (LCEL), which utilizes the pipe character “|” similar to Linux pipes. However, in Python, “|” typically acts as a bitwise OR operator, producing a logical OR result. It was unclear how LCEL assigns a unique operational function to “|”, so I investigated.

chain = prompt | model | outputparser # Investigated this "|"
chain.invoke("Question.")

 

[Background] Operator Overloading

In Python, users can change the behavior of operators for their classes by declaring special methods like __eq__.

To define “|”, you would use __or__ and __ror__:

    __or__: Executed when on the left side of "|". In A|B, A's __or__ is called.
    __ror__: Executed when on the right side of "|". In A|B, B's __ror__ is called.

 

Examples
Example 1: Checking __or__ Behavior

Declare __or__ in classes A and B. When "|" is operated in A, it outputs "A's __or__ method is called". In B, it outputs "B's __or__ method is called".

class A:
    def __init__(self, value):
        self.value = value
    def __or__(self, other):
        print("A's __or__ method is called")
        return self.value | other.value

class B:
    def __init__(self, value):
        self.value = value
    def __or__(self, other):
        print("B's __or__ method is called")
        return self.value | other.value

objA = A(2)
objB = B(3)
result = objA | objB

Output:

A's __or__ method is called

This shows A’s “|” was executed.

If reversed,

result = objB | objA
print(result)

the output is:

B's __or__ method is called

indicating B’s “|” was executed. The __or__ of the object before "|" is executed.
Example 2: Checking __ror__ Behavior

Declare __ror__ only in class B.

class A:
    def __init__(self, value):
        self.value = value

class B:
    def __init__(self, value):
        self.value = value
    def __ror__(self, other):
        print("B's __ror__ method is called")
        return self.value | other.value

objA = A(2)
objB = B(3)
result = objA | objB

Output:

B's __ror__ method is called

The __ror__ of the object after "|" was executed.
Example 3: Checking Priority of __ror__ and __or__

Declare __or__ in class A and __ror__ in class B.

class A:
    def __init__(self, value):
        self.value = value
    def __or__(self, other):
        print("A's __or__ method is called")
        return self.value | other.value

class B:
    def __init__(self, value):
        self.value = value
    def __ror__(self, other):
        print("B's __ror__ method is called")
        return self.value | other.value

objA = A(2)
objB = B(3)
result = objA | objB

Output:

A's __or__ method is called

If __or__ is declared in the class before "|", it is executed. The __ror__ of the class after "|" is ignored.

 

[Main Topic] Examining Operator Overloading in LangChain’s Source Code

In LCEL, components like prompt, model, and outputparser are all based on the Runnable class. So, we looked into __or__ and __ror__ methods of the Runnable class.

class Runnable(Generic[Input, Output], ABC):  # Excerpt
    def __or__(
            self,
            other: Union[
                Runnable[Any, Other],
                Callable[[Any], Other],
                Callable[[Iterator[Any]], Iterator[Other]],
                Mapping[str, Union[Runnable[Any, Other], Callable[[Any], Other], Any]],
            ],
        ) -> RunnableSerializable[Input, Other]:
            """Compose this runnable with another object to create a RunnableSequence."""
            return RunnableSequence(self, coerce_to_runnable(other))

    def __ror__(
        self,
        other: Union[
            Runnable[Other, Any],
            Callable[[Other], Any],
            Callable[[Iterator[Other]], Iterator[Any]],
            Mapping[str, Union[Runnable[Other, Any], Callable[[Other], Any], Any]],
        ],
    ) -> RunnableSerializable[Other, Output]:
        """Compose this runnable with another object to create a RunnableSequence."""
        return RunnableSequence(coerce_to_runnable(other), self)

Executing self(Runnable object)|other generates and returns a RunnableSequence(self, coerce_to_runnable(other)) object. other|self(Runnable object) is also possible.

coerce_to_runnable used here converts non-Runnable standard Python components to Runnable ones but only accepts Runnable, callable, or dict. Anything else raises an exception.

def coerce_to_runnable(thing: RunnableLike) -> Runnable[Input, Output]:
    """Coerce a runnable-like object into a Runnable.

    Args:
        thing: A runnable-like object.

    Returns:
        A Runnable.
    """
    if isinstance(thing, Runnable):
        return thing
    elif inspect.isasyncgenfunction(thing) or inspect.isgeneratorfunction(thing):
        return RunnableGenerator(thing)
    elif callable(thing):
        return RunnableLambda(cast(Callable[[Input], Output], thing))
    elif isinstance(thing, dict):
        return cast(Runnable[Input, Output], RunnableParallel(thing))
    else:
        raise TypeError(
            f"Expected a Runnable, callable or dict."
            f"Instead got an unsupported type: {type(thing)}"
        )

So, it’s clear that because __or__ and __ror__ are declared in Runnable, we can use expressions like Runnable object | other (Runnable object, callable, or dict) or other | Runnable object. The returned RunnableSequence is then invoked.

Practical Use of “|” in LCEL

Try callable or dict | Runnable object.

 

from langchain_core.runnables import RunnableLambda
from operator import itemgetter

# Function that returns the length of text
def length_function(text):
    return len(text)

# In the following chain, the itemgetter fetches the value with key="foo" and passes it to the length_function.
chain = itemgetter("foo") | RunnableLambda(length_function)

# Output is 2 ("aa" has 2 characters).
chain.invoke({"foo":"aa"})

Key points:

    Use the callable itemgetter.
    Wrap length_function in RunnableLambda to make it Runnable.

However, the following results in an error:

# Error
chain = {"foo":"aa"} | RunnableLambda(length_function)
chain.invoke({"foo":"aa"})

In LCEL, dictionary values must also be runnable, callable, or a dict, likely checked recursively. If you need to include a dictionary in the chain, use the following:

chain = (lambda x: {"foo":"aa"}) | RunnableLambda(length_function)
chain.invoke({"foo":"aa"})

This makes the entire dictionary part callable, avoiding issues. The argument x in the lambda, which is the input to invoke, is unused here and thus discarded.

How the Pipe Operator Works

https://www.pinecone.io/learn/series/langchain/langchain-expression-language/

 

from langchain_core.runnables import (
    RunnableParallel,
    RunnablePassthrough
)

retriever_a = vecstore_a.as_retriever()
retriever_b = vecstore_b.as_retriever()

prompt_str = """Answer the question below using the context:

Context: {context}

Question: {question}

Answer: """
prompt = ChatPromptTemplate.from_template(prompt_str)

retrieval = RunnableParallel(
    {"context": retriever_a, "question": RunnablePassthrough()}
)

chain = retrieval | prompt | model | output_parser

 

 

| 结合 pipe库做数据预处理

https://mathdatasimplified.com/write-clean-python-code-using-pipes-3/

https://www.the-analytics.club/pipe-operations-in-python/

https://zhuanlan.zhihu.com/p/432755818

 

posted @ 2024-05-22 09:43  lightsong  阅读(6)  评论(0编辑  收藏  举报
Life Is Short, We Need Ship To Travel