Who trains the trainers?
Our ability to influence LLMs is seriously circumscribed. If you own the LLM and its associated tooling, you can, of course, exert outsized influence on its output. AWS, for example, should be able to train Amazon Q to answer questions about AWS services. There’s an open question as to whether Q would be “biased” toward AWS services, but that’s almost a secondary concern. Maybe it steers a developer toward Amazon ElastiCache and away from Redis simply by virtue of having more and better documentation and information to offer. The primary concern is ensuring these tools have enough good training data that they don’t lead developers astray.
For example, in my role running developer relations for MongoDB, we’ve worked with AWS and others to train their LLMs with code samples, documentation, and more. What we haven’t done (and can’t do) is ensure that the LLMs generate correct responses. If a Stack Overflow Q&A has 10 bad examples and three good examples of how to shard in MongoDB, how can we be certain that a developer asking GitHub Copilot or another tool for guidance is informed by the three good ones? The LLMs have trained on all sorts of good and bad data from the public Internet, so it’s a bit of a crapshoot whether a developer will get good advice from a given tool.
Microsoft’s Victor Dibia delves into this, suggesting, “As developers rely more on codegen models, we need to also consider how well does a codegen model assist with a specific library/framework/tool.” At MongoDB, we regularly evaluate how well the different LLMs address a range of topics so that we can gauge their relative efficacy and work with the different LLM vendors to try to improve performance. But it’s still an opaque exercise without clarity on how to ensure the different LLMs give developers correct guidance. There’s no shortage of advice on how to train LLMs, but it’s all for LLMs that you own. If you’re the development team behind Apache Iceberg, for example, how do you ensure that OpenAI is trained on the best possible data so that developers using Iceberg have a great experience? As of today, you can’t, which is a problem. There’s no way to ensure developers asking questions (or expecting code completion) from third-party LLMs will get good answers.
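To make the evaluation idea concrete, here is a minimal sketch of one way such a check could work: scoring a model’s answer about a specific library against known-good and known-bad API idioms. This is not MongoDB’s actual evaluation harness; the model responses are stubbed strings (in practice they would come from each vendor’s API), and the pattern lists and model names are hypothetical.

```python
# Hypothetical harness for scoring LLM answers about a specific library.
# Responses are stubbed; real usage would call each vendor's API.

def score_response(response: str, good_patterns: list[str], bad_patterns: list[str]) -> float:
    """Return a score in [0, 1]: the fraction of known-good idioms present,
    penalized by any known-bad (deprecated or incorrect) idioms found."""
    if not good_patterns:
        return 0.0
    hits = sum(1 for p in good_patterns if p in response)
    penalties = sum(1 for p in bad_patterns if p in response)
    return max(0.0, (hits - penalties) / len(good_patterns))

# Hypothetical "correct" idioms for a MongoDB sharding question,
# and a placeholder string standing in for a known-bad answer pattern.
GOOD = ["sh.shardCollection", "hashed"]
BAD = ["handles the rest"]

# Stubbed answers standing in for real model output.
responses = {
    "model_a": 'Use sh.shardCollection("db.coll", {userId: "hashed"}) ...',
    "model_b": "Just enable sharding and MongoDB handles the rest.",
}

scores = {name: score_response(text, GOOD, BAD) for name, text in responses.items()}
for name, s in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {s:.2f}")
```

Even a crude harness like this makes relative efficacy visible: run the same library-specific prompts against each model, score the answers, and track the scores over time. What it cannot do, as the paragraph above notes, is push a third-party LLM toward the good patterns in the first place.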