Broken Glass
There are a lot of potential applications for AI in business, but it often seems like that’s about as specific as it gets, which – when you’re trying to work out whether AI is right for your business, or how to get your feet wet without jumping in from the high-board – can be a bit too vague. “Automation can sort your finance processes.” Great... but how, exactly? We use Xero and HubSpot, both of which offer “AI” elements. What is this exact “automation” you speak of? That’s the kind of vagueness that can put you off trying things out in the first place.
And so, I want to give a real-world example of using AI to solve a problem. You can then try it yourself and perhaps translate it into a real-world business need. For this example, I’m going to focus on the power of visual analysis.
We’ve all seen the memes of LLMs failing to identify, or misidentifying, an animal within a photograph, but many of those limitations – just like the video of Will Smith eating spaghetti – have been overcome in recent months. We are now [and have been for a little while] at the point where most LLMs can analyse and document the contents of images to a high level of accuracy. This is of course dependent on the complexity of the scene and how the LLM is prompted, but the “limitation” now more often lies with the user and their ability to ask the right question.
Which brings us to Broken Glass.
Because the most important thing to do before jumping into using a tool is to identify the problem you are planning to use the tool to fix. This is the number one reason so many “AI initiatives” and new-tool adoptions fail. The “problem” was never identified, and the “tool” was just added to the catalogue of digital solutions that “are going to be great” but ultimately solved nothing.
So why broken glass? Well, it’s a pretty binary place to start.
Let’s imagine I run a glassware business, delivering glasses to various catering firms around the country. With each delivery I have a quality control process in place: the delivery company must photograph each glass individually and send me the photo so I can verify whether the glass is broken or not. So, I have a bunch of photos of glasses; some are broken and some are not.
I am the only person in my business qualified to determine whether a glass is broken, so it falls to me to check these individually. The issue is that I am on holiday, but I don’t want to hold up the glass distribution. I know I said “real-world example”, but I’m sure there are similar binary states you can think of that need visual proof within your business. For now, we have the glasses. So, I have entrusted the process to my client team, who have some spare capacity to look at the photos but not the expertise to determine their state. Now, with the images in hand and ChatGPT open, I have given them the following prompt. Once they have entered it, they can simply share an image and get the analysis back.
You are an expert image analysis assistant.
Your task is to analyse an image and answer ONLY the following two questions:
1. Does the image contain a **drinking glass**?
2. If a drinking glass is present, is **any** part of the glass **broken, chipped, or cracked**?
By “drinking glass”, we mean a vessel used to drink liquids such as water, wine, juice, or beer. Ignore other uses of glass, such as windowpanes, eyeglasses, glass tables, or decorative objects.
Respond using this **exact format** with **only “Yes” or “No”** for each line:
Drinking glass present: [Yes/No]
Broken glass: [Yes/No]
Rules:
- If **no drinking glass** is present, answer `Broken glass: No`.
- If **multiple glasses** are present, and **any one** of them is broken, answer `Broken glass: Yes`.
- Do **not** explain your answer.
- Do **not** describe the image.
- Do **not** use any words other than the specified two-line format.
Be literal, strict, and precise in your interpretation. Only classify drinking vessels.
And just like that I have an intelligent bot / service / agent / interface / whatever you want to call it, one that can reliably flag the instances where a glass is broken, and that won’t give me a running commentary on the image’s contents or misinterpret images that don’t contain glasses.
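A side benefit of forcing that strict two-line format is that the reply becomes machine-readable, which matters if you later want to process results in bulk rather than read them one by one. As a minimal sketch (the helper name and dictionary keys are my own, purely illustrative), here is how you might parse the model’s reply in Python:

```python
import re

def parse_glass_reply(reply: str) -> dict:
    """Parse the strict two-line response the prompt enforces.

    Expects exactly:
        Drinking glass present: Yes/No
        Broken glass: Yes/No
    Raises ValueError if the model strayed from the format.
    """
    pattern = re.compile(
        r"^Drinking glass present:\s*(Yes|No)\s*\n"
        r"Broken glass:\s*(Yes|No)\s*$",
        re.IGNORECASE,
    )
    match = pattern.match(reply.strip())
    if not match:
        raise ValueError(f"Unexpected reply format: {reply!r}")
    present, broken = (answer.lower() == "yes" for answer in match.groups())
    return {"glass_present": present, "broken": broken}
```

The `ValueError` branch is the useful part: if the model drifts off-format and starts describing the image, the parser fails loudly instead of silently mis-recording a result.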
Like I said, the broken-glass analysis may not be an immediate need – lots of people can tell whether a glass is broken – but hopefully this illustrates how easy it can be to set up certain, more skilled, kinds of image analysis. What if you could prompt an LLM to reliably identify whether a room contained a window, or a smoke alarm, or a TV, just from an image? Or maybe something more granular: what if the image was of the blades of a rack-mount server, with each blade responsible for a particular role, and you wanted to identify the roles each server is able to play?
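The nice thing is that the Broken Glass prompt generalises: swap the target object and its definition, keep the same strict output contract. A small Python helper (entirely my own, purely to make the pattern explicit) shows one way to template it:

```python
def detection_prompt(target: str, definition: str) -> str:
    """Build a strict yes/no detection prompt for any target object.

    `target` is the thing to look for (e.g. "smoke alarm");
    `definition` disambiguates it, just as the original prompt
    distinguished drinking glasses from windowpanes.
    """
    return (
        "You are an expert image analysis assistant.\n"
        f"Your task is to analyse an image and answer ONLY: "
        f"does the image contain a {target}?\n"
        f'By "{target}", we mean {definition}. Ignore anything else.\n'
        'Respond using this exact format with only "Yes" or "No":\n'
        f"{target.capitalize()} present: [Yes/No]\n"
        "Do not explain your answer. Do not describe the image.\n"
        "Be literal, strict, and precise in your interpretation."
    )
```

For example, `detection_prompt("smoke alarm", "a ceiling- or wall-mounted fire detection device")` gives you the smoke-alarm version of the same check with no rewriting.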
I am sure there are examples in your industry – as there are many within AEC – where this kind of functionality would be useful. And this is how easy it can be to replicate.
Of course, there are next steps you might want to investigate once you’ve proved this to be a valid and useful tool, like adding it to an app so your teams can take a photo directly on site and have the analysis fed back in real time, without the need for your oversight. Or perhaps a simple web front-end to allow batch-uploading of images, with the results added to a spreadsheet so you can quickly analyse the anomalies. These are again very simple but helpful examples of tooling that can be put together in little time and that can start saving time straight away.
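To give a flavour of that batch idea, here is a rough Python sketch that walks a folder of photos, sends each one to a vision-capable model via the OpenAI API, and collects the replies into a CSV. The model name `gpt-4o`, the `OPENAI_API_KEY` environment variable, and the function names are all assumptions on my part; treat this as a starting point under those assumptions, not a finished tool:

```python
import base64
import csv
from pathlib import Path

# The full Broken Glass prompt from above would be pasted in here.
PROMPT = "..."

def image_to_data_url(path: Path) -> str:
    """Base64-encode an image file as a data URL the API accepts."""
    b64 = base64.b64encode(path.read_bytes()).decode("ascii")
    return f"data:image/jpeg;base64,{b64}"

def classify_folder(folder: str, out_csv: str = "results.csv") -> None:
    """Send every .jpg in `folder` to the model; record each reply in a CSV."""
    from openai import OpenAI  # pip install openai
    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    with open(out_csv, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["file", "response"])
        for path in sorted(Path(folder).glob("*.jpg")):
            reply = client.chat.completions.create(
                model="gpt-4o",  # assumption: any vision-capable model works
                messages=[
                    {"role": "system", "content": PROMPT},
                    {"role": "user", "content": [
                        {"type": "image_url",
                         "image_url": {"url": image_to_data_url(path)}},
                    ]},
                ],
            ).choices[0].message.content
            writer.writerow([path.name, reply])
```

Point `classify_folder("delivery_photos")` at the day’s images and you get a spreadsheet of Yes/No answers to skim for anomalies, which is exactly the oversight-while-on-holiday scenario above.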
If you’re already thinking about your version of broken glass, drop me a note. I’d be happy to share how I’ve helped others turn these kinds of prompts into working tools.