Takeaways from the LLMs in Production Conference

Because answers from LLMs are polished and draw on a large repository of facts, and because we tend to ask them the sorts of generic and shallow questions where their pattern library serves best, we are fooled into thinking that they are more generally capable than they actually are.

At the same time, they are in fact highly capable of a wide variety of genuinely useful tasks. Most of their output is intellectually shallow, but many real-world tasks are shallow.  

To get past the buzz and to gain real value we need to bring these models into production. That’s why we held a watch party for the recent LLMs in Production conference. Here are our key takeaways. 

 

For me, the main takeaway was the multitude of engineering challenges and opportunities presented by the recent developments in AI. LLMs and foundation models create limited value on their own. It's the way in which the user interacts with them that makes them useful. The chat interface was already a step forward for LLMs, as more context could be provided to the model through the chat history. But should user interaction only occur through natural language and dialog? During the conference, multiple alternatives were presented that require engineering effort to connect LLMs to nonlinguistic inputs and outputs.

Linus’ talk “Generative Interfaces beyond Chat” provided some interesting alternatives for providing context to AI assistants in software applications. What is the user currently looking at? Which menus are open? Which actions did the user take recently? Translating this nonlinguistic information into natural language and feeding it to LLMs through the prompt could greatly improve the user experience. For example, a user working on a spreadsheet can be assisted more effectively when the LLM is aware of which cells have been selected.
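As a rough illustration of that translation step, here is a minimal sketch of how application state could be turned into a natural-language context block for the prompt. The `UIState` fields and the function names are hypothetical and not taken from the talk; a real application would expose its own state.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class UIState:
    """Hypothetical snapshot of what the user is doing in a spreadsheet app."""
    active_sheet: str
    selected_cells: List[str] = field(default_factory=list)
    open_menus: List[str] = field(default_factory=list)
    recent_actions: List[str] = field(default_factory=list)


def describe_ui_state(state: UIState) -> str:
    """Translate nonlinguistic UI state into plain sentences for the prompt."""
    lines = [f"The user is working in the sheet '{state.active_sheet}'."]
    # Only mention the parts of the state that actually carry information.
    if state.selected_cells:
        lines.append("Selected cells: " + ", ".join(state.selected_cells) + ".")
    if state.open_menus:
        lines.append("Open menus: " + ", ".join(state.open_menus) + ".")
    if state.recent_actions:
        lines.append("Recent actions: " + "; ".join(state.recent_actions) + ".")
    return "\n".join(lines)


def build_prompt(state: UIState, question: str) -> str:
    """Prepend the UI description to the user's question."""
    return f"Context:\n{describe_ui_state(state)}\n\nUser question: {question}"
```

The resulting string would be sent to whichever LLM the application uses; the model call itself is left out here because it depends on the provider.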

There are opportunities on the output side of the LLM too. Harrison Chase’s talk on tools for LLMs was particularly interesting. Providing answers to questions is useful, but still requires action on the side of the user. Having LLMs use tools to search the web, execute Python scripts and interact with other agents expands their capabilities beyond chat. But this is not a trivial feat, since language models by design can only generate text. To use tools with LLMs therefore requires an additional engineering layer that translates instructions into the corresponding actions.
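To make that extra layer concrete, here is a minimal sketch of a dispatcher that turns the model's text into an action. It assumes the model has been prompted to reply with JSON of the form `{"tool": ..., "args": {...}}`; that format, the `TOOLS` registry, and the two example tools are illustrative assumptions, not the approach shown in the talk.

```python
import json
from typing import Callable, Dict


def word_count(text: str) -> str:
    """Example tool: count whitespace-separated words."""
    return str(len(text.split()))


def add(a: float, b: float) -> str:
    """Example tool: add two numbers."""
    return str(a + b)


# Hypothetical registry mapping tool names to Python callables.
TOOLS: Dict[str, Callable[..., str]] = {"word_count": word_count, "add": add}


def run_tool_call(model_output: str) -> str:
    """Parse the model's text and execute the tool it names.

    Always returns text, so the result (or the error) can be fed back
    to the model in a follow-up prompt.
    """
    try:
        call = json.loads(model_output)
    except json.JSONDecodeError:
        return "error: output was not valid JSON"
    if not isinstance(call, dict):
        return "error: expected a JSON object"
    name = call.get("tool")
    if not isinstance(name, str) or name not in TOOLS:
        return f"error: unknown tool {name!r}"
    args = call.get("args", {})
    if not isinstance(args, dict):
        return "error: 'args' must be a JSON object"
    try:
        return TOOLS[name](**args)
    except TypeError as exc:
        # Raised when the model supplies missing or unexpected arguments.
        return f"error: bad arguments ({exc})"
```

Only explicitly registered tools can run, which keeps the model from triggering arbitrary code. Returning errors as text rather than raising lets the model see what went wrong and try again.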

– Jochem Loedeman 

 

One of the key takeaways for me was the importance of awareness around prompt hacking. Understanding how prompts can be manipulated is essential for preventing security vulnerabilities. For example, a carefully crafted prompt might lead a chatbot to expose secret information in its reply. This issue is particularly relevant given the increasing number of use cases that rely fully on automation, where prompt hacking can go unnoticed because of a lack of human supervision.

Another takeaway concerns the use cases for large language models. While the application in chatbots is obvious, there are many other promising directions to follow. Natural language can serve as a simple (and democratized) interface for tasks that are too technical or time-consuming to execute by hand (say, writing API calls, building boilerplate code, sketching images or diagrams, etc.), yet trivial for humans to understand and validate.

As a data scientist, I see generative AI as a productivity tool rather than a job replacement. While LLMs are powerful, humans must remain in the loop. This shift allows for higher-level thinking with a focus on business solutions.

– Caio Benatti Moretti 

 

My takeaways from the Vector Databases and Large Language Models talk:

  • Current LLMs are often far more useful once they can draw on contextually appropriate data when tackling domain-specific tasks.
    • However, the traditional way of achieving this, through retraining or fine-tuning the models, is often infeasible, for example because of the high cost or a lack of the necessary expertise, or simply because the model is locked away behind an API.
    • Luckily, there is another way you can bring contextually appropriate data into the mix: by making use of vector embeddings. These embeddings create a representation of some piece of (unstructured) data that is searchable using similarity metrics.
    • By giving your generative model access to a vector database containing embeddings of useful data, you allow it to search and retrieve information necessary for solving the task at hand.
  • Example use cases that can benefit from giving your model access to a vector database with relevant embeddings are:
    • Contextual retrieval: by allowing your model to search and retrieve relevant pieces of information from a “knowledge base”. This is useful for document discovery and retrieval solutions.
    • LLM memory: by allowing your model to store, search and retrieve pieces of its conversations. This can be used for developing chatbots that require previous conversation context.
    • LLM caching: by allowing your model to store queries and their corresponding responses and reuse them for semantically similar requests. This saves computational and monetary costs, and can significantly speed up your solution.
  • TL;DR: you can supercharge your LLMs by connecting them to a vector database because it allows them to leverage custom, contextually appropriate data.
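The retrieval and caching ideas above can be sketched in a few lines. This is a toy, in-memory stand-in rather than a real vector database: `embed` uses simple word counts where a real system would call an embedding model, and the `VectorStore` class, `cached_answer` helper, and the 0.8 threshold are all hypothetical choices made for illustration.

```python
import math
import re
from collections import Counter
from typing import List, Optional, Tuple


def embed(text: str) -> Counter:
    """Toy embedding: word counts. A real system would call an embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


class VectorStore:
    """Minimal in-memory store: embeds a key text and keeps a payload with it."""

    def __init__(self) -> None:
        self._items: List[Tuple[Counter, str]] = []

    def add(self, key_text: str, payload: str) -> None:
        self._items.append((embed(key_text), payload))

    def search(self, query: str, k: int = 1) -> List[Tuple[float, str]]:
        """Return the k most similar payloads as (score, payload) pairs."""
        q = embed(query)
        scored = [(cosine(q, vec), payload) for vec, payload in self._items]
        return sorted(scored, key=lambda s: s[0], reverse=True)[:k]


def cached_answer(
    cache: VectorStore, query: str, threshold: float = 0.8
) -> Optional[str]:
    """Semantic cache lookup: reuse a stored response if the query is close enough."""
    hits = cache.search(query, k=1)
    if hits and hits[0][0] >= threshold:
        return hits[0][1]
    return None
```

For contextual retrieval, the documents themselves are stored as payloads and the top results from `search` are pasted into the prompt. For caching, the key is an earlier query and the payload is the response the model gave, so a sufficiently similar new query can skip the model call entirely.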

– Stijn Tonk 

  

To conclude our takeaways: if you are a data scientist or machine learning engineer, I see no reason to start worrying about your job.

Recent advances in LLMs are leading to more and more automation. It is already possible to write a single prompt that creates a complete model pipeline in Python. But before we completely hand over our jobs to prompt engineers, don't forget that there is still quite some work to do before a model actually delivers value to your organisation.

Many things are not that easy to automate yet: translating a business problem into a data product, designing the feedback loop for continuous learning, tying the pieces together and choosing the right tooling for the job. 

Good business sense, strong communication, and the technical skills to orchestrate it all will be key to staying relevant in our industry.
