Investment firms – hedge funds, sell-side research groups, traders, and asset managers generally – are increasingly relying on the web, via web scraping, as a source of investment data, a trend that is set to accelerate in coming years, according to a new report from Opimas.
‘Web Scraping for Investments - Asset Managers' Bid to Generate Alpha Using the Ultimate Dataset’, authored by Opimas founder and CEO Octavio Marenzi, estimates that, with 206 billion page views per day in 2018, web scraping for investments already accounts for 5 per cent of all web traffic.
Opimas expects that by 2020 there will be more than two billion websites, along with trillions of web pages. Asset managers, equity analysts, and traders are increasingly using web-based data and incorporating the resulting information into their decision-making processes.
Total spending on web scraping for investment purposes is expected to exceed USD1.8 billion by 2020, with slightly more than two-thirds consisting of internal spending. Opimas also expects external spending to see accelerated growth in the near term.
More than 10 billion web pages per day were accessed for investment-related web scraping in 2018, a figure that is expected to exceed 25 billion per day by 2022.
The heaviest users of web scraping for investments harvest data from more than 100 million web pages daily, and the total volume of pages scraped for investing rivals the traffic of the busiest websites in the world, including Facebook and Google.
Uses of web-scraped data in the investment process range from sentiment analysis to economic data, price comparators, market data, company financial reports, corporate actions, and reference data. The process of web scraping is complex, and web data extraction faces some potential impediments, notably regulation, legal challenges and anti-scraping technologies meant to prevent the capture of content. Even so, Opimas predicts that investment firms will step up their efforts to collate enormous amounts of actionable data.