Thursday, July 21, 2016

How can machine learning be applied to the financial sector?

After last year's chill in the P2P lending market, P2P companies in transition have been saying that they will use data and machine learning to serve finance and turn themselves into modern financial technology (Fintech) companies. But getting machines, rather than people, to process data and make judgments is not easy, and in the domestic financial sector this work has only just started.

Lei feng's network has been in contact with CreditX (Krypton), a startup that uses machine learning for risk control and has a good deal of practical experience and reflection in the financial sector. Krypton founder Zhu Mingjie recently gave a speech at the Randy China Summit on the difficulties of applying machine learning to finance and on how to improve model interpretability. The content below has been edited and abridged.

I have worked on machine learning for more than 10 years, using machines instead of people to process data and make decisions and judgments. Over those 10-plus years, machine learning has been applied successfully on the Internet, in search, advertising, and recommendation, and it is fair to say that the Internet was the first industry to enter the data age. In financial innovation, everyone has only just started on the question of how to reach the Internet's level of machine learning and artificial intelligence. Today I would like to share CreditX's experience and reflections from bringing Internet-grade machine learning practice to finance.

Financial risk management pain points

I have always believed that "scientific and technological progress is forced out by business requirements." In the Internet industry we rely on algorithms and machines because we are forced to: the volume of data is simply too large. If you search Taobao for a phone case, having Ali's staff pick out the most suitable items for you by hand from hundreds of millions of products is out of the question. In traditional finance, a loan of 1 million can support risk control done by people and relationships, yet even so the banks' credit card centers have backlogs of applications awaiting review, and approval staff work overtime every week without end.

Internet finance now faces more inclusive scenarios, such as phone credit of a few hundred dollars, where relying on manpower is simply impracticable. So this is not merely a matter of improving operational efficiency: the work has to be handed to machines, letting them learn risk-control experience and turning the machine into a risk-control expert.

Difficulties in applying machine learning and artificial intelligence in the financial sector

The first problem is too little data. Financial data is very sparse, and many of today's financial products did not exist before, so there is no 10-plus years of accumulated data behind them. In other words, training data is lacking, which is also known as the cold start or missing-data problem. In addition, in finance it takes at least one to several months for a loan to turn bad, so accumulating data means waiting a long time. Internet search, by contrast, gets click feedback almost immediately; the difference is great. The lack of data is a huge barrier to machines learning human experience.

The second problem is too much data, meaning that the dimensionality of the data is more than people can handle. Traditional finance used only 10 or so feature variables, which could be handled with manually tuned formulas. Now we face high-dimensional data, we have grand visions, and we discuss how much data could be used. So why is it not used? The question is whether we have a method with enough expressive power to make use of these very raw data, which can also be called weak variables. The weak data have to be combined and linked to the outcome in a way that people can understand intuitively, so that risk experts can give feedback.

In finance, unlike on the Internet, machine learning cannot be a black box into which you throw a pile of data and then iterate on the resulting feedback. Finance places particular emphasis on model interpretability, so that risk experience and intuition can be related to how the data performs. On this basis, human experience can take part in machine learning modeling. Features need to be traceable back to their source, especially because in finance the outcome takes a long time to arrive and someone needs to be able to intervene and give feedback quickly.

How to deal with the cold start problem in financial risk control

Too little data

Data that is too scarce and too slow to arrive is a typical cold start problem. We often faced missing data in the Internet industry and built up a mature practice there: add the human factor to machine learning. When we worked on search advertising, we would invite people to label data, and the experts who annotated the data would then guide the algorithm engineers in tuning the algorithm to improve the ranking results. In finance, there is a great deal of ready-made experience, and there are many seasoned risk management personnel with strong risk-control knowledge.

In theory, if we had hundreds of risk management experts who needed no wages, we could do phone credit that way, but in reality we have to rely on machines to learn their risk-control experience. So we use a semi-supervised learning approach, in which the experience of risk-control business experts and actual outcomes are learned online and combined to make credit decisions. In this process, risk officers can intervene in real time, keep making adjustments based on the output, and feed those adjustments back in real time into the model's training iterations.

This is why we pay special attention to the human factor. Everyone is talking about artificial intelligence now, but what is its essence? My understanding is that it means getting machines to learn human experience. We used to rely on a few experienced risk management staff; now we can have machines learn that experience and make decisions automatically.

Outcomes and samples from financial operations are very precious. For example, I may have accumulated samples in a mortgage business and then move to a new consumer credit business, or switch from one consumer credit business to another new business. These precious samples cannot be thrown away, but how can they be used? We try to reuse this experience and knowledge as much as possible: we separate the generic core risk model from domain knowledge, combine prior knowledge according to the business and the scenario, learn and reuse knowledge across domains and scenarios on that basis, and keep accumulating it.

The difficulty of too much data: deep learning feature engineering

Next, let's look at "too much data". I divide this problem into two parts.

The first is that the data has a great many dimensions. What we care about is how to link the data to financial risk, and that requires very powerful feature processing and representation capabilities. This is hard to do with traditional linear regression modeling. We have many approaches, including the currently very popular "deep learning". The essence of deep learning is learning feature representations from data, much as people come to know and express data. To solve the problem of too much data and let people see through the vast raw data, we add a stage in front of the model where we have tried different deep feature encoding methods and unsupervised learning methods to preprocess the raw data, reducing the dimensionality of the features and linking the vast raw data to the final outcome.
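The talk does not describe the deep encoders themselves. As a minimal sketch of the general idea (an unsupervised step that compresses many weak raw variables into fewer dense features before the model sees them), the following uses principal component analysis via power iteration, a much simpler linear technique swapped in here for illustration. The function names and the toy data are hypothetical and are not taken from CreditX's system.

```python
import math

def first_principal_component(rows, n_iter=100):
    """Return (column means, unit direction of greatest variance).

    Uses power iteration on the sample covariance matrix. The starting
    vector is all ones; in the unlikely case it is exactly orthogonal to
    the top component, a different starting vector would be needed.
    """
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix (d x d).
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(n_iter):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:  # data has no variance at all
            break
        v = [x / norm for x in w]
    return means, v

def encode(row, means, component):
    """Project one raw row onto the component, giving a single dense feature."""
    return sum((x - m) * c for x, m, c in zip(row, means, component))

# Hypothetical raw data: three correlated weak variables per applicant.
raw = [
    [1.0, 2.1, 0.9],
    [2.0, 3.9, 2.1],
    [3.0, 6.2, 2.9],
    [4.0, 8.0, 4.2],
]
means, component = first_principal_component(raw)
dense_features = [encode(row, means, component) for row in raw]
```

A real pipeline would keep several components, or replace this step with a learned nonlinear encoder, but the shape of the step is the same: fit on unlabeled raw data, then feed the compressed features to the downstream risk model.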

Model interpretability

The second part is model interpretability, which is of particular interest to financial experts. It matters for two reasons:

If a credit scoring result cannot be explained, it is hard to communicate with the applicant;

In addition, we face a very complex environment; if the risk result remains a black box, risks are difficult to control and estimate.

If the model is wrong, we cannot afford the resulting risk. Given how fast Internet finance is growing, the company's business might well not survive. Therefore the Internet's black-box-in, black-box-out approach does not suit finance, and a local explanation of the model is needed. Our experience is to use LIME to capture the key variables behind a result or a subset of results, so that risk-control experts can quickly grasp what caused the result to change.
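To make the idea of a local explanation concrete, here is a minimal sketch of a local linear surrogate in the spirit of LIME: perturb one applicant's features, weight the perturbed samples by proximity, and fit a weighted linear model whose coefficients indicate which variables drive the score near that applicant. This is a simplified illustration written from scratch, not the API of the `lime` package and not CreditX's implementation; the function names, the toy scoring model, and all parameter defaults are assumptions.

```python
import math
import random

def solve_linear_system(a, b):
    """Solve a @ x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                for c in range(col, n + 1):
                    m[r][c] -= factor * m[col][c]
    return [m[i][n] / m[i][i] for i in range(n)]

def local_linear_explanation(predict, instance, n_samples=500,
                             scale=0.5, kernel_width=1.0, seed=0):
    """Return one local weight per feature for `predict` around `instance`."""
    rng = random.Random(seed)
    d = len(instance)
    # Accumulate the weighted normal equations for [intercept, w_1..w_d].
    xtx = [[0.0] * (d + 1) for _ in range(d + 1)]
    xty = [0.0] * (d + 1)
    for _ in range(n_samples):
        z = [x + rng.gauss(0.0, scale) for x in instance]  # perturbed sample
        dist2 = sum((zi - xi) ** 2 for zi, xi in zip(z, instance))
        w = math.exp(-dist2 / kernel_width ** 2)           # proximity weight
        row = [1.0] + [zi - xi for zi, xi in zip(z, instance)]
        y = predict(z)
        for i in range(d + 1):
            xty[i] += w * row[i] * y
            for j in range(d + 1):
                xtx[i][j] += w * row[i] * row[j]
    coef = solve_linear_system(xtx, xty)
    return coef[1:]  # drop the intercept

def black_box_score(x):
    # Hypothetical opaque model: a nonlinear score over three made-up features.
    z = 2.0 * x[0] - 1.0 * x[1] + 0.1 * x[0] * x[2]
    return 1.0 / (1.0 + math.exp(-z))

applicant = [0.3, 1.2, 0.5]
weights = local_linear_explanation(black_box_score, applicant)
# Features ordered by how strongly they move the score near this applicant.
ranked = sorted(enumerate(weights), key=lambda kv: abs(kv[1]), reverse=True)
```

The ranked list is the kind of output a risk-control expert could read directly: it names the variables that matter for this one decision, without requiring the underlying model to be simple.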

Krypton's results

We have taken technology and experience from the Internet and made a difficult attempt in finance, and we have gained some practical experience along the way. It covers the data acquisition process from the very beginning, the participation of people, and human intervention in complex models, which together make up our practice.

In terms of efficiency, one of our partners has obtained good results. They run a financial credit scenario; with Krypton's system and models deployed, it needs only 3-4 risk and operations staff, and most of the risk work is handed to the machine.

In terms of effectiveness, with a DNN model the KS value rose from 0.19 for a traditional LR model to 0.43. The model's numbers and results are the most direct answer we can give; there is no need to talk in concepts.
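For readers unfamiliar with the metric, KS (Kolmogorov-Smirnov) measures how well a score separates bad accounts from good ones: it is the largest gap, over all score thresholds, between the cumulative share of bad accounts and the cumulative share of good accounts at or below the threshold. The sketch below shows the standard calculation; the function name and the sample scores are illustrative and unrelated to the figures quoted in the talk.

```python
def ks_statistic(scores, labels):
    """KS between the score distributions of bad (1) and good (0) accounts."""
    n_bad = sum(1 for y in labels if y == 1)
    n_good = len(labels) - n_bad
    if n_bad == 0 or n_good == 0:
        raise ValueError("both good and bad samples are required")
    pairs = sorted(zip(scores, labels))
    cum_bad = cum_good = 0
    best = 0.0
    i = 0
    while i < len(pairs):
        j = i
        # Consume all samples tied at this score before comparing the CDFs.
        while j < len(pairs) and pairs[j][0] == pairs[i][0]:
            if pairs[j][1] == 1:
                cum_bad += 1
            else:
                cum_good += 1
            j += 1
        best = max(best, abs(cum_bad / n_bad - cum_good / n_good))
        i = j
    return best

# Hypothetical model scores (higher means riskier) with observed outcomes.
scores = [0.10, 0.25, 0.30, 0.45, 0.60, 0.70, 0.85, 0.90]
labels = [0, 0, 1, 0, 0, 1, 1, 1]
ks = ks_statistic(scores, labels)
```

A KS of 0 means the score does not separate the two groups at all, and 1 means it separates them perfectly, so a move from 0.19 to 0.43 reflects a substantially better ranking of risk.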

After all the earlier high expectations of data and the repeated disappointments that followed, now is a good time for data technology. We really do need the ability to use data and to use machines to solve financial problems. This is the opportunity of our time, the current trend, and also a new beginning.

Next month, on the 12th and 13th, Lei feng's network (search for the "Lei feng's network" public account to follow it) will hold an unprecedented artificial intelligence and robotics summit in Shenzhen, where we will publish the "Top 25 innovative enterprises in artificial intelligence and robotics" list. We are collecting and confirming high-quality projects in AI, robotics, autonomous and unmanned systems, and related areas. If your project is in one of these fields and has sufficient technical barriers and growth potential, please contact 2020@leiphone.com.
