An interesting little story caught my eye in last Saturday’s Times, writes Newsdirect Wales’ director Valerie Livingston.

AI Trawls 20,000 miles of state papers

(I’d attach a link but – rather ironically – I came across the original story in the physical copy of the paper and it doesn’t appear to be online.)

Times Policy Editor, Oliver Wright, writes that the UK Cabinet Office has been working with data science consultancy Atchai on a number of prototype projects to harness the data which lies in government documents.

There is apparently a 20,000-mile-long trail of paper, uncatalogued and unsearchable, sitting in government archives. In addition to trivial notes and lists, there is a wealth of research in briefing notes and academic papers. As politicians and civil service personnel change, this information gets lost or forgotten.

The UK Government’s intention is to improve decision making by restoring access to this data. The project has already run into one challenge particularly familiar to the teams at Newsdirect: the document formats previously used in government are not always conducive to searching. Whitehall sources nonetheless seem confident this can be overcome.

AI for the many

Accessibility of data is changing and so are the tools to work with it.

Last week also saw the launch of Google Pinpoint, a suite of tools to support journalists working with large file sets.

Combining AI, machine learning, optical character recognition and speech-to-text technologies, Pinpoint can identify and highlight people, locations and organisations throughout thousands of pages of text. The days of CTRL + F may well be numbered.

And while the UK Government’s digital documents project is aimed at getting research into the hands of ministers, any moves towards making more data available in searchable formats have exciting implications for scrutiny.

At Newsdirect, we’re looking forward to it.