The Next Level of Alt Data
New sources of information continue to be revealed by technological innovation, placing an emphasis on blending them with experience and investment expertise.
Alternative data isn’t a new idea, but what’s new in alternative data and how it’s being used always makes for an interesting conversation. It’s especially enjoyable if that conversation is with Laurent Laloux, Chief Product Officer, Capital Fund Management (CFM). Laloux can explain complex ideas in an engaging and informative manner, and probably has the most concise and memorable definition of “big data” you’ll ever hear or read about. At a firm that bases every investment strategy on a quantitative and scientific foundation, Laloux also knows that no matter how good the data, what really matters is how it fits into and improves investment strategy performance. II recently spoke with Laloux about big data, alt data, and how natural language processing (NLP) and other AI techniques can help reveal hidden indicators in the market that lead to investment opportunities.
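Is it accurate to say that in order to fully leverage alternative data you must first understand the concept of big data? And that big data doesn’t just mean “lots of data?”
Many people think big data is just a huge data set, but the term is a misnomer. It’s more complex than sheer volume: big data is an evolving concept. At any point in time, a particular data set is “big” if it sits at the limit of what technology can handle. Twenty years ago, analyzing tick-by-tick data from exchanges was big data, because that was the most that state-of-the-art technology could process. Today, any modern computer system, whether in-house or in the cloud, can handle that type of data efficiently. So even if a data set is huge in terms of size, it isn’t big if it doesn’t challenge the technology. The concept of big data moves over time as technology innovates. As a quant manager, we’ve always been at the cutting edge of technology and the analytical techniques it enables along with the growing levels of data, so we’re moving in sync with both to heighten our capacity to measure or understand the economy and financial markets, which in turn helps shape our investment strategies.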
What’s an example of the limits of technology and big data today?
We now have an internet of things – all kinds of devices have chips in them, and those devices are connected to networks and they generate a vast amount of data which is only available thanks to this technology. That’s a limit of what technology can manage, and that’s big data – new information that you can catch and analyze.
What’s the current-day relationship between big data and alternative data?
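Alternative data is basically data that is not directly related to financial markets, but that serves as a proxy or indicator for the economy. It can be data about cargo on ships or trains, total consumer spending from supermarkets, credit card information, and so on. The corpus of alternative data generated by businesses, humans, and the internet gives you insight into the underlying activity of companies, people, and machines. If you are able to manage alt data and establish relationships between it and its impact on the economy and on assets, you can potentially build better asset management strategies.
It’s worth noting that alt data isn’t necessarily big data. Some alternative data can be pretty small and easy to manage, even in an Excel spreadsheet. Social media information is both alternative data and big data. There are more tweets than all the structured financial news published since the beginning of the 1990s, for example. Most of them are irrelevant regarding the economy, but you have to deal with a massive amount of data to find a needle in a haystack. Storing and managing the fire hose of data on Twitter requires market-leading data management, and extracting market-moving information requires top-notch NLP technology.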
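Why has it been so important for a quant firm like CFM to make the evolutionary journey alongside technology and data?
It is by looking at historical data that you can find statistical patterns and statistical stability, which gives you the opportunity to try to predict a little bit of the future. Without data, quants wouldn’t exist. The number of data scientists working in financial markets is a result of the readily available, high-quality financial data that emerged in the 1980s and continues to grow even today. The key is that we are at the forefront of big data and technology when it comes to identifying the highest-quality data sets and the techniques that will allow us to find interesting patterns in that data. The goal is to build models to predict what financial assets might do in the future.
Along the data and technology journey, we have learned how to use the latest statistical or IT techniques to stay at the cutting edge. You don’t just show up and say, “We’re at the cutting edge.” Staying at the cutting edge is a never-ending pursuit, as long as there are new technologies and new data that allow a more refined view of the economy. So you can’t stand still: you must constantly learn new methods and new technologies, and hire new IT specialists and quant scientists, so that you are always up to date with the latest developments in academia and big tech.
Presumably, you’re always looking for the next generation of talent?
Exactly, and the new generations are constantly challenging the older ones to push in new directions.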
So, where are we at now in the evolution of technology as it pertains to leveraging big data and alt data?
We’ve been through a big wave of complete virtualization, meaning that today you can go to a cloud or use a virtual machine and you don’t have to think all that much about hardware. That’s good for us because we don’t want to be hardware specialists – we want to consume the best hardware for any given task. We’ve been moving on from hardware to higher level software, and the next stack is standard tools, libraries, and languages that you can leverage. In the past, you had to develop a lot of your own libraries to implement statistical ideas and concepts. It was straightforward for mathematicians and statisticians. But you still needed to code it.
These days, there’s a high-quality stack of open source, open library software that will essentially implement the building blocks. That allows us to focus on higher level concepts, i.e. “What do I understand? And what do I want to model?” As time goes by, technology is becoming closer and closer to what you would write on the blackboard as a quant. Access to this easy hardware management and standard open-source, high-quality code allows us to focus on what makes our specialty and expertise critical for investors – namely, to build models and try to predict risk.
What types of specialized knowledge and expertise are required to pull all of this together on behalf of investors?
You have the people who design models by looking at data and inferring how it could be used to build an indicator of whether the price of an existing asset might go up or down. Financial markets are very efficient, so there is very little information buried in a huge amount of noise. A key competence is understanding the signal-to-noise ratio while being aware that there’s always a risk that what you are doing contains a lot of information which might bias your results. Some say there is more art to this element than science.
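The second type of expertise that is critical in quant is trade management, which requires a highly technical mindset. The data is intense and very structured (prices and volumes), and you need to understand how to interact with the market and which trade execution is best for minimizing price impact and costs.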
The third type of talent is the people combining all of this information through portfolio construction to optimize our strategies. They take the signals from the first group I described, and the trade cost information from the second group, and they mix that with their risk model and they try to build a portfolio which is optimal in terms of size, quality, and Sharpe ratio.
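As a rough illustration of the blend Laloux describes, the sketch below scores a candidate portfolio by its net-of-cost Sharpe ratio. It is a minimal example under stated assumptions, not CFM’s method: the function name `portfolio_sharpe`, the proportional cost model (cost scales with the absolute size of each position, as if building the portfolio from cash), and a risk-free rate of zero are all hypothetical choices made for the example.

```python
import math


def portfolio_sharpe(weights, expected_returns, covariance, trade_costs):
    """Net-of-cost expected return divided by portfolio volatility.

    weights          -- position sizes, one per asset
    expected_returns -- forecast return per asset (the "signals")
    covariance       -- square covariance matrix (the "risk model")
    trade_costs      -- assumed cost per unit of absolute weight traded
    """
    n = len(weights)
    gross = sum(weights[i] * expected_returns[i] for i in range(n))
    # Assumption: costs grow linearly with the absolute position size.
    costs = sum(abs(weights[i]) * trade_costs[i] for i in range(n))
    variance = sum(
        weights[i] * covariance[i][j] * weights[j]
        for i in range(n)
        for j in range(n)
    )
    if variance <= 0:
        raise ValueError("portfolio variance must be positive")
    return (gross - costs) / math.sqrt(variance)
```

A portfolio construction process would compare this score across many candidate weightings; a real optimizer would also handle constraints, turnover, and non-linear price impact, which this sketch leaves out.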
The end goal of all of what you’ve been describing is not only identifying the signals, but also identifying what has a material impact on price, yes?
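What you want is to predict how capital will flow. If you are able to predict that, you will know how it will affect prices, and you can position your portfolio accordingly. So, from a high-level view, that’s the big goal.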
Getting back to alt data for a moment – how are you incorporating NLP into what CFM does?
Natural language processing has been on our radar for nearly a decade but it has been a big push for us over the past three years, during which time there has been a massive evolution in the capacity of neural networks [a series of algorithms] to analyze human text and generate text like a human. We’ve been doing text analysis for a fairly long time using less advanced technology – looking at vocabulary, at syntax, at the effect of certain sentences. However, today, starting with generic off-the-shelf neural networks and leveraging our quant market expertise to thoroughly retrain them on the specific corpus of financial news, we can obtain results that are much better than what we used to obtain in a more manual way. It’s another example of leveraging cutting-edge technology and adapting it to the specific context – using our experience and training in how markets and portfolios behave allows us to implement that alt data from NLP in a very efficient way and improve our investment strategies.
What’s an example of how you might use NLP?
We’ve looked at earnings calls from corporations, when CEOs and CFOs are giving quarterly updates on how the company is doing, what it is doing, and so on. You can do human analysis, for instance, and try to guess what it all means. Or you can use the massive power of a new generation of neural networks, which are able to capture more subtle and refined information in a way that is extremely difficult for a human to do.
For example, one idea that has been discussed in financial circles is that when a CFO gives a quarterly update, over time he or she will develop a certain style of delivery and word usage, etc. But imagine that this company is in the midst of an M&A deal. Typically, during M&A, the lawyers will tell executives what they can and cannot say, how they should phrase things, and so on. The type of algos we’ve been discussing are potentially able to capture slight changes in the way the CFO is talking about the company. Now we can grasp these subtle differences as an advanced indicator that something is happening within this company. You can’t be sure what is happening, but it might prompt you to look at other indicators, such as price dynamics or other types of news.
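The idea of spotting a change in a speaker’s habitual wording can be sketched with a much simpler tool than the neural networks discussed in the interview. The example below is only an illustration under stated assumptions: it swaps in a plain bag-of-words cosine similarity, and the function names (`word_counts`, `cosine_similarity`, `style_drift`) are hypothetical. It compares the words used in a new call against the pooled wording of earlier calls.

```python
import math
import re
from collections import Counter


def word_counts(text):
    """Lowercase the text and count alphabetic tokens."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def cosine_similarity(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def style_drift(past_calls, new_call):
    """Return 1 - similarity between pooled past wording and a new call.

    Values near 0 mean the wording matches the speaker's history;
    values near 1 mean almost no overlap in vocabulary.
    """
    baseline = Counter()
    for text in past_calls:
        baseline.update(word_counts(text))
    return 1.0 - cosine_similarity(baseline, word_counts(new_call))
```

A rising drift score would not say what has changed, only that the wording departs from the speaker’s history, which matches the interview’s point that such a signal is a prompt to look at other indicators.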
How do you see your use of NLP evolving?
I think it’s very promising for ESG. Typically, ESG data is self-reported and corporations like to have the greenest possible balance sheet. In truth, a lot of the green we see on balance sheets is sometimes more a reflection of the size of the marketing department than what the company is actually doing ESG-wise. That’s why it’s important to use external data sources instead of simply self-reported information.
In ESG, there is a lot of information in blogs, news, reviews, and so on. That’s why there’s a real hope that by training the NLP system to look for specific carbon aspects, for example, or environmental and social aspects, you can get a more accurate real-time measure of the ESG quality of a company instead of relying purely on the self-reported balance sheet picture.
How do you see the use of alt data continuing to evolve?
This is a golden age for data emanating from all over the world. The big challenge currently is filtering all of it. One could possibly imagine that in the future we might hit a data lull where there is no new data or no new angle – a bit like an “AI winter.” That could be caused by economic recession or major changes in the world at large, or because technology development plateaus and new analysis of alt data slows. However, this all seems unlikely and right now there’s still plenty of information to keep us busy. One aspect which gives me great hope is that more and more public agencies are trying to collect and offer data to their citizens. Such open data initiatives allow us to aggregate and collect new information. There’s huge, untapped potential there because the details are messy – different standards, for example, and even within the same country each agency might have a different protocol and define things differently. But, if you’re able to collect this information, and rationalize and harmonize it, the potential is vast.
Data isn’t worth much if you don’t know how to optimize its use though, is it?
Right, the barriers to accessing data might be less severe than they once were, but knowing how to blend it with trading and portfolio construction still takes tremendous skill, knowledge, and experience. Simply having data doesn’t make you a data scientist any more than buying a piece of marble makes you a sculptor. That’s where our nearly 30 years of experience as a quant firm and team come into play. Alt data is important and interesting, but it’s not the only way we can improve our models. Relying on quality price data and other financial data, and leveraging models and statistical tools to find deeper relationships within the data is just as important. We do both. We cover new data and existing data using the latest technology, and revisit and review what we’ve been doing with a critical eye. In this way we continue to improve and evolve what we do, and provide the service and products our clients are looking for.
Learn more about CFM and its strategies.
Any description or information involving investment process or allocations is provided for illustration purposes only. There can be no assurance that these statements are or will prove to be accurate or complete in any way. All figures are unaudited. This article does not constitute an offer or solicitation to subscribe for any security or interest.