Our client is a publishing house that runs several national newspapers and magazines. They regularly deal with a large number of customer requests about delivery, subscriptions and many other topics. Over the past two years, they have seen a 20% increase in the number of customer emails they receive, and an even bigger increase is predicted for the coming years. What do all those requests have in common? They all need to be processed manually. Manual processing costs a lot of time and money, and it’s clear that their customers deserve quick and accurate responses.
“Could AI help reduce response times and at the same time improve the relevance of responses? How could we identify relevant figures, or at least estimates, to calculate the return on investment?” These were fundamental questions our client needed answers to. We approached them by examining how well AI-based email classification would cope with real data, specifically our client’s data. The challenge consisted of:
- processing emails in varying forms and formats,
- identifying key information to recognize the purpose of the mail and
- conceiving of a way to train the system so that it could learn from “experience”.
Every single email in the publisher’s inbox contains at least one desired outcome or “intent”, i.e. the customer’s motivation for writing to the publisher. As a starting point for the proof of concept (PoC), we picked newspaper delivery issues as a test case.
To investigate what an AI-based solution was capable of, we also picked five “sub-intents” that we would attempt to train our system to identify, such as “newspaper was put in the wrong place”, “newspaper was not delivered” or “newspaper delivery was late”. We found that similar wording in the email text made it challenging to differentiate between subtle nuances in the intent of the email. Overall, our goal was to identify the ratio of correctly identified intents to emails that couldn’t be readily classified.
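To make that ratio concrete, here is a minimal Python sketch of how such a tally could be computed. The function name, the intent labels and the convention that `None` marks an email the classifier could not assign are our own illustrative assumptions, not the evaluation code used in the PoC.

```python
from collections import Counter
from typing import Optional, Sequence


def summarize_predictions(
    gold: Sequence[str],
    predicted: Sequence[Optional[str]],
) -> dict:
    """Tally correct, wrong and unclassified emails.

    Assumption: predicted[i] is None when no intent could be assigned.
    """
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")

    counts = Counter()
    for expected, actual in zip(gold, predicted):
        if actual is None:
            counts["unclassified"] += 1
        elif actual == expected:
            counts["correct"] += 1
        else:
            counts["wrong"] += 1

    total = len(gold)
    return {
        "correct": counts["correct"],
        "wrong": counts["wrong"],
        "unclassified": counts["unclassified"],
        # Share of all emails whose intent was identified correctly.
        "accuracy": counts["correct"] / total if total else 0.0,
    }


# Hypothetical labels for three of the delivery sub-intents.
gold = ["not_delivered", "late", "wrong_place", "late"]
predicted = ["not_delivered", None, "wrong_place", "not_delivered"]
summary = summarize_predictions(gold, predicted)
```

Keeping "unclassified" separate from "wrong" matters here: an email the system declines to classify can still be routed to a human, whereas a wrong classification may trigger the wrong response.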
To gain more insights from the PoC, we spent just two weeks training the system on real data to identify the key information in the emails the publisher commonly receives. The information to extract could include the following elements:
- The subscription number
- The name of the customer
- A shipping address and/or a customer ID
- A time frame for a delivery interruption or a concrete date (for start of delivery)
- The purpose of the email
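As a rough illustration of what such an extraction target could look like, the Python sketch below defines a record for these fields and fills two of them with simple regular expressions. This is not how the PoC worked (entity extraction there was configured in Watson Knowledge Studio); the field names, the "Abo-Nr." pattern with 6 to 10 digits and the DD.MM.YYYY date format are all assumptions made for the example.

```python
import re
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class ExtractedInfo:
    """Fields we would like to pull out of one email (names are hypothetical)."""
    subscription_number: Optional[str] = None
    customer_name: Optional[str] = None
    shipping_address: Optional[str] = None
    customer_id: Optional[str] = None
    dates: List[date] = field(default_factory=list)
    intent: Optional[str] = None


# Assumed format: "Abo-Nr. 12345678", "Abonnement-Nummer: 12345678", etc.
SUBSCRIPTION_RE = re.compile(
    r"Abo(?:nnement)?-?\s*(?:Nr\.?|Nummer)\s*:?\s*(\d{6,10})", re.IGNORECASE
)
# Assumed German date format DD.MM.YYYY.
DATE_RE = re.compile(r"\b(\d{1,2})\.(\d{1,2})\.(\d{4})\b")


def extract_info(text: str) -> ExtractedInfo:
    """Fill in the fields that simple patterns can find; leave the rest empty."""
    info = ExtractedInfo()
    match = SUBSCRIPTION_RE.search(text)
    if match:
        info.subscription_number = match.group(1)
    for day, month, year in DATE_RE.findall(text):
        try:
            info.dates.append(date(int(year), int(month), int(day)))
        except ValueError:
            continue  # skip impossible dates such as 31.02.2019
    return info


sample = (
    "Guten Tag, meine Abo-Nr. 12345678. "
    "Bitte Lieferung vom 01.08.2019 bis 15.08.2019 unterbrechen."
)
info = extract_info(sample)
```

Names, addresses and the purpose of the email are left empty on purpose: they rarely follow a fixed pattern, which is exactly where a trained language model is needed rather than hand-written rules.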
The information gathered during the development of the prototype allowed us to better demonstrate the challenges and the potential of pursuing AI-powered solutions:
- Pairing natural language understanding (NLU) with partial data is very challenging since third-party AI libraries’ performance varies depending on the language. In our case, IBM Watson turned out to be the best fit for German language processing.
- Understanding what dates refer to was one of the biggest hurdles. For example, if a newspaper delivery is to be suspended, machines and even humans may be confused when subsequent emails in the conversation refer to earlier dates on which the suspension should have taken place but didn’t.
- Since emails seldom follow any set format, pre-processing them was essential in order to exclude unnecessary data.
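To give an idea of what excluding unnecessary data can involve, here is a minimal Python sketch that drops quoted replies and cuts an email off at the sign-off. The rules are illustrative assumptions (a handful of German and English markers), not the pre-processing pipeline built for the PoC.

```python
import re

# Assumed sign-off markers: "-- " signature delimiter and common German closings.
SIGN_OFF_RE = re.compile(
    r"^(--\s*$|mit freundlichen gr(ü|ue)(ß|ss)en|viele gr(ü|ue)(ß|ss)e)",
    re.IGNORECASE,
)
# Assumed headers that introduce a quoted earlier message.
REPLY_HEADER_RE = re.compile(
    r"^(am .+ schrieb .+:|on .+ wrote:|-{2,}\s*(original message|ursprüngliche nachricht))",
    re.IGNORECASE,
)


def preprocess_email(body: str) -> str:
    """Keep only the customer's own text, as a single whitespace-normalized string."""
    kept = []
    for line in body.splitlines():
        stripped = line.strip()
        # Everything after a sign-off or reply header is treated as noise.
        if SIGN_OFF_RE.match(stripped) or REPLY_HEADER_RE.match(stripped):
            break
        # Lines starting with ">" are quoted text from earlier emails.
        if stripped.startswith(">"):
            continue
        kept.append(stripped)
    return re.sub(r"\s+", " ", " ".join(kept)).strip()


raw = (
    "Guten Tag,\n\n"
    "meine Zeitung wurde heute nicht geliefert.\n\n"
    "Mit freundlichen Grüßen\n"
    "Erika Mustermann\n"
    "> alte Nachricht"
)
cleaned = preprocess_email(raw)
```

Cutting at the first reply header is a trade-off: it removes noise for the classifier, but it also discards earlier messages that might be needed to resolve which date a customer is referring to.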
With 200 hours spent on pre-processing emails and configuring the AI (classifier, natural language understanding, Watson Knowledge Studio), about 80% of email intents in the given sample were classified correctly. To us, this meant there is legitimate potential for automating responses and shortening email processing times with artificial intelligence.
Thanks to our PoC, our client was able to better envisage the expected yield from such an investment as well as spot potential risks and identify the next steps that will provide the greatest value for them.