Lieyunwang editor's note: ZestFinance, a US Internet finance company, has recently drawn growing attention and imitation among Internet finance practitioners in China for its credit evaluation model based on big data. This article briefly introduces ZestFinance and the financial environment it operates in, analyzes the background and technology of big data credit reporting, and gives an objective and comprehensive account of what this technology means for the future development of Internet finance and the credit reporting industry in China. It is well worth reading for entrepreneurs interested in Internet finance. Lieyunwang shares it here from Tsinghua Financial Review.
By Michael Liu and Shi Jie
ZestFinance, formerly known as ZestCash, is an emerging US Internet finance company. It was founded in Los Angeles in September 2009 by Douglas Merrill, former chief information officer of the Internet giant Google, and Shawn Budde, a former senior executive in the credit division of the financial institution Capital One, who once managed a subprime credit business with more than 1 billion dollars in revenue. ZestFinance's R&D team consists mainly of mathematicians and computer scientists. In its early stage the company mainly offered lending services through the ZestCash platform; it later shifted its focus to credit assessment services. Its aim is to use big data to reshape the loan approval process, creating available credit for individuals who find it hard to obtain traditional financial services (the underbanked) and lowering their cost of borrowing.
ZestFinance started out by offering an online alternative to traditional payday loans, which take their name from the borrower's promise to repay on payday. Because the traditional credit risk evaluation system cannot cover the whole population, about 15% of people have no credit score and are turned away by banks, unable to meet even basic credit needs. Besides serving borrowers with no credit score, whom the traditional system cannot handle, ZestFinance also targets an area the traditional system handles poorly: people whose low credit scores lead to high borrowing costs. It uses big data technology to lower their cost of credit. Compared with traditional credit management, ZestFinance's processing efficiency is nearly 90% higher, and in risk control its model performs 40% better than traditional credit evaluation models.
ZestFinance is now expanding deeper into other areas of credit risk management. In February 2014 it unveiled a Collection Score based on big data analysis, designed to provide a new scoring system for auto financing, student loans, and medical loans. The company's future direction is to extend the advantage it has built in payday loans to other kinds of lending, including credit cards, auto loans, and even home loans. Its ambition is that within the next 10 to 15 years this method will replace the existing indicators and become the sole evaluation standard for credit applications.
ZestFinance began to attract attention in China in July 2013, when Peter Thiel, the well-known American investor and co-founder of the global third-party payment platform PayPal, led a 20 million dollar financing round in the company.
Why big data credit reporting
Traditional credit evaluation services cannot cover the whole population, especially vulnerable groups
Figure 1 shows how the US population is distributed across FICO scores. Everyone starts from a base score of 850 points, and the scoring model uses the data in a consumer's credit report to deduct points from 850 according to a number of risk factors. Overall, the distribution of US consumer credit scores is small at both ends and large in the middle. As many as 40% of people score between 750 and 850: about 13% score between 800 and 850, and more than 25% score between 750 and 799. This group is the backbone of credit in society and corresponds to the American middle class. The average FICO score of American consumers is 678.

Figure 1 also shows that many people score far below the average of 678: 8% score between 550 and 599, 5% score between 500 and 549, and 2% score 499 or below. Under FICO's standards, people who have failed to repay a loan or who lack borrowing experience are automatically treated as risky, and their loans carry punitively high interest rates. Alternatively, their loan applications are simply rejected, regardless of whether there was a legitimate reason for their record, such as a medical emergency or recent immigration to the United States.

As Table 1 shows, the population served can be divided into four bands by FICO score, each corresponding to different financial services. Under the traditional credit assessment system (the FICO score), individual consumers with missing or incomplete credit records are often hard to reach with traditional financial services. Even in the United States, with its developed financial system, they either cannot access regular financial services or must pay a high price for them.
Traditional credit evaluation models rely on a narrow set of information dimensions
The basic idea of the traditional FICO scoring model is to compare a borrower's past credit history with the credit habits of all borrowers in a database, and to check whether the borrower's trajectory resembles that of financially troubled borrowers who frequently default, overdraw at will, or even file for bankruptcy. As shown in Figure 2, the model examines a user's creditworthiness from five aspects. As the credit business has developed, however, the FICO score has been criticized for its single standard, strict thresholds, and one-sided results.
Traditional credit evaluation models have played a large role in credit risk management, for example in promoting the rapid development of the US mortgage market. In the era of big data, however, consumers generate many new dimensions of information, such as e-commerce activity, social networks, and search behavior, and the ability of traditional models to solve the problem is increasingly limited.
Traditional credit evaluation models lag in time
Although the FICO score still reflects the relative ranking of risk, its ability to predict absolute risk proved limited in the 2008 financial crisis. As Figure 3 shows, the distribution of the US population across FICO scores barely changed from 2005 to 2011, even though the 2008 financial crisis produced a large volume of bad debts.
Because traditional credit evaluation models based on the FICO score have narrow coverage, a narrow set of information dimensions, and a time lag, the era of big data calls for new approaches to credit evaluation. Abroad, the three major credit bureaus and FICO have already begun forward-looking research on using big data technology to improve the traditional credit evaluation system. Experian, for example, has assigned a team to study the influence of social network data on credit scores, and FICO began research projects many years ago on online information assessment tools and Internet-based credit evaluation systems.
ZestFinance's big data credit evaluation practice
ZestFinance's basic idea is that all data is related to credit, so it mines credit information from as much data as it can obtain. Its use of big data technology works at two levels, big data collection and big data analysis, to uncover the creditworthiness of people who lack credit histories.
Big data collection technology
ZestFinance's data collection is built on multiple data sources. On the one hand, it inherits the decision variables of the traditional credit system and emphasizes deep mining of the applicant's credit history. On the other hand, it also takes into account other factors that can affect a user's credit, such as online information and the information the user supplies in the application, so that depth and breadth are combined.
ZestFinance's data sources are very rich. It relies on structured data and also imports a large amount of unstructured data. It further includes a great deal of non-traditional data, such as a borrower's rent payment records, pawnshop records, and online data. It even treats marginal information as a factor in credit evaluation, such as a borrower's capitalization habits when filling in forms and whether the borrower reads the accompanying text before submitting an application online. Such non-conventional data acts as a sensor of the objective world: it reflects the borrower's real situation and maps the customer's real social network. Only by gaining insight into the clues behind a borrower's behavior, and the correlations among them, can one provide deep and effective data analysis and reduce loan default rates.
As shown in Figure 4, the diversity of ZestFinance's data sources is reflected in the following ways. First, the most important data for credit evaluation is still the data that ZestFinance purchases from or exchanges with third parties. It includes bank and credit card data as well as non-traditional data such as legal records and the number of times a person has moved.
Next is online data, such as IP address, browser version, and even screen resolution. From these data it is possible to infer a user's location, personality, and behavioral traits, which helps in assessing credit risk. Social network data is also an important source for big data credit reporting.
Finally, ZestFinance asks users directly. To prove their ability to repay, users tend to answer in detail and accurately. They can also submit supporting documents and public records, such as water, electricity, and phone bills.
Multi-dimensional big data means that ZestFinance does not have to rely entirely on the traditional credit system. It can describe an individual consumer from different angles and quantify that consumer's creditworthiness in greater depth.
Big data analysis models
Figure 5 shows the principle behind ZestFinance's credit evaluation analysis, which fuses information from multiple sources and uses advanced machine learning prediction models and ensemble learning strategies to mine big data. First, thousands of raw data items from third parties (such as phone bills and rental history) and from the borrower are fed into the system. Second, the system looks for correlations among the data and transforms the data. Third, based on these correlations, the variables are recombined into larger measures, each of which reflects a particular aspect of the borrower, such as the likelihood of fraud, long-term and short-term credit risk, and ability to repay. These larger variables are then fed into different data analysis models. Finally, the outputs of the individual models are combined by a voting rule to form the final credit score.
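The step of recombining raw variables into larger measures can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the signal names, their values, and the grouping are hypothetical, and the grouping is fixed by hand here, whereas the article says ZestFinance derives it from correlations among the data. It is not ZestFinance's actual method.

```python
from statistics import mean

# Hypothetical raw signals for one applicant, already scaled to [0, 1],
# where a higher value means higher risk. Names and values are invented.
raw_signals = {
    "phone_bill_late_ratio": 0.2,
    "rent_late_ratio": 0.1,
    "address_changes_scaled": 0.6,
    "ip_location_mismatch": 1.0,
    "application_inconsistency": 0.5,
}

# Hypothetical hand-made grouping of related signals into larger measures.
GROUPS = {
    "fraud_likelihood": ["ip_location_mismatch", "application_inconsistency"],
    "repayment_risk": ["phone_bill_late_ratio", "rent_late_ratio"],
    "stability_risk": ["address_changes_scaled"],
}


def build_measures(signals, groups):
    """Average each group of related signals into one larger measure."""
    return {
        name: mean(signals[key] for key in keys)
        for name, keys in groups.items()
    }


measures = build_measures(raw_signals, GROUPS)
for name, value in measures.items():
    print(f"{name}: {value:.2f}")
```

Averaging is only the simplest way to combine a group of signals; the point of the sketch is the shape of the data flow, from many raw items to a few interpretable measures that downstream models consume.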
ZestFinance has developed 10 analysis models based on machine learning. They analyze more than 10,000 data points for each credit applicant and derive more than 70,000 indicators that can be used to measure the applicant's behavior, all within five seconds. The 10 models vote in the following way: imagine seating your 10 smartest friends around a table and asking each of them for an opinion on a given question. The decision performance of this mechanism is far better than the industry average.
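The voting idea can be sketched in Python as follows. This is a hedged illustration only: the article does not specify the voting rule, so simple averaging of each model's risk estimate is assumed here, and the three toy models, their weights, and the measure names are hypothetical stand-ins for the 10 machine learning models described above.

```python
from statistics import mean


def make_model(weights, bias):
    """Return a toy model: a weighted sum of measures, clipped to [0, 1]."""
    def model(measures):
        raw = bias + sum(weights[key] * measures[key] for key in weights)
        return min(1.0, max(0.0, raw))
    return model


# Three hypothetical models, each looking at a different mix of measures.
MODELS = [
    make_model({"fraud_likelihood": 0.5, "repayment_risk": 0.5}, bias=0.0),
    make_model({"repayment_risk": 0.8, "stability_risk": 0.2}, bias=0.05),
    make_model({"fraud_likelihood": 1.0}, bias=-0.1),
]


def ensemble_risk(measures, models):
    """Collect one risk estimate per model and average them (assumed rule)."""
    votes = [model(measures) for model in models]
    return mean(votes), votes


# Hypothetical larger measures for one applicant, each in [0, 1].
applicant = {
    "fraud_likelihood": 0.75,
    "repayment_risk": 0.15,
    "stability_risk": 0.6,
}

risk, votes = ensemble_risk(applicant, MODELS)
print("individual votes:", [round(v, 2) for v in votes])
print("combined risk:", round(risk, 2))
```

Because each model weighs the measures differently, a single model's blind spot is diluted by the others, which is the intuition behind asking 10 friends rather than one.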
In recent years, this big-data-based credit risk evaluation framework, although still far from being a mainstream credit evaluation method, has been adopted by many Internet finance institutions at home and abroad, such as Kabbage in the United States, Kreditech in Germany, and, in China, Wecash, which recently received a Series A investment of 40 million yuan from IDG. Together they are challenging the traditional credit system.
As shown in Table 2, this article compares the big-data-based credit evaluation system with traditional credit assessment, taking the US credit system as the example, and finds that the main differences lie in the following aspects.