Data analysis and forecasting in 1C: Enterprise

Start and end of business processes

The life cycle of a business process begins with its start. At this route point you can define a Before Start event handler. The handler takes two parameters. The first is the route point from which the handler was called (a business process can have several start points); the second is Failure. Setting Failure to True cancels the start of the business process. In the Before Start handler you can check the conditions required to start the business process and create "related" objects whose references must be stored in the business process itself. It is not recommended to implement user dialogs (opening dialog forms) in this handler.

The start of a business process itself can be done in different ways:

programmatic start of a business process (from code in the built-in language);

interactive start (clicking the OK button on the business process form);

start of a business process as a nested one.

Using the data analysis and forecasting mechanism in 1C

The data analysis and forecasting mechanism lets application solutions include tools that identify patterns usually hidden behind large volumes of information.

The mechanism works both with data from the infobase and with data from another source that has been pre-loaded into a value table or spreadsheet document. Applying one of the analysis types to the source data produces an analysis result, which is a model of how the data behaves. The result can be displayed in a final document or saved for later use.

An analysis result can then be used to create a forecast model that predicts the behavior of new data in accordance with the existing model. For example, you can analyze which products are purchased together (in one invoice) and save the forecast model built on this analysis in the database.

Using text document layouts

A 1C: Enterprise text document presents information as text. It can be read from a text file and saved to a text file. It can be placed in a form or in a layout, and it can be handled from the built-in language. Broadly, a text document supports three logical groups of actions:
- reading text files from disk and writing them to disk;
- working with individual lines of a text document: getting, adding, deleting, replacing;
- creating a text layout and using it to build the resulting text document.

Besides building the contents of a text document directly, you can fill text documents from layouts. A text document layout describes the fixed parts of the document and the fields into which data can be inserted. Filling a text document from a layout consists of reading particular areas of the layout, filling them with data in a loop, and outputting the filled parts one after another to the resulting text document.

Layout format for a text document. A text document layout is itself a text document that uses service lines starting with the "#" character. The control character is followed by keywords that describe particular elements of the layout.

The layout also uses the service characters "[" and "]", which mark the positions of the layout fields to be filled in.

The entire layout of a text document consists of areas. One area combines several consecutive lines. Areas must follow one another and cannot overlap or be nested. An area is described with the keywords Region and End of the Region; the keyword Region is followed by the name of the area.
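The area-and-field scheme described above can be illustrated outside the platform. The following Python sketch is not 1C code: the directive spellings `#Region` and `#EndRegion`, the function names and the sample layout are all assumptions made for illustration. It only shows the general idea of reading named areas, filling the bracketed fields in a loop, and appending the filled parts to the output.

```python
import re

def parse_areas(layout_text):
    """Split a layout into named areas; each area keeps its body lines."""
    areas, name, body = {}, None, []
    for line in layout_text.splitlines():
        if line.startswith("#"):
            words = line[1:].split()
            if words and words[0] == "Region":       # hypothetical keyword spelling
                name, body = words[1], []
            elif words and words[0] == "EndRegion":  # hypothetical keyword spelling
                areas[name] = body
                name = None
            continue  # service lines never reach the output
        if name is not None:
            body.append(line)
    return areas

def fill(area_lines, values):
    """Replace every [Field] marker with the matching value."""
    def substitute(match):
        return str(values[match.group(1)])
    return [re.sub(r"\[(\w+)\]", substitute, line) for line in area_lines]

LAYOUT = """#Region Header
Invoice [Number]
#EndRegion
#Region Row
[Item]: [Qty]
#EndRegion
"""

areas = parse_areas(LAYOUT)
output = fill(areas["Header"], {"Number": 17})
for item, qty in [("Tea", 2), ("Sugar", 1)]:
    # The same area is filled repeatedly, once per data row.
    output += fill(areas["Row"], {"Item": item, "Qty": qty})
print("\n".join(output))
```

The header area is output once and the row area once per data record, which mirrors the "read area, fill, output" cycle described in the text.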

The mechanism is represented by a set of objects of the 1C: Enterprise built-in language. The interaction scheme of the main objects of the mechanism is shown in the figure.

Data analysis column settings - a set of settings for the input columns of the data analysis. For each column, they specify the type of data it contains, the role the column plays, and additional settings that depend on the type of analysis performed.
Data analysis parameters - a set of parameters for the data analysis. The composition of the parameters depends on the type of analysis. For example, cluster analysis specifies the number of clusters into which the source objects must be divided, the way the distance between objects is measured, and so on.
Source data - the data source for analysis. A query result, a cell area of a spreadsheet document, or a value table can act as the data source.
Analyzer - the object that directly performs the data analysis. The data source and the parameters are set on this object. Its output is the data analysis result, whose type depends on the type of analysis.
Data analysis result - a special object containing information about the result of the analysis. Each type of analysis has its own result type. For example, the result of a decision tree analysis is an object of type ResultAnalysisDataTreeSolution. The result can later be output to a spreadsheet document with the data analysis report builder, read through programmatic access to its contents, or used to create a forecast model. Any data analysis result can be saved for later use.
Forecast model - a special object that makes a forecast from input data. The type of the model depends on the type of data analysis. For example, a model created for association search analysis has the type ModelPredictionSearchAssociation. The forecast data source is passed to the input of the forecast model, and the output is a value table containing the predicted values.
Forecast sample - a value table, a query result, or an area of a spreadsheet document containing the information on which to forecast. For example, for the association search forecast model, the sample may contain the list of products in a sales document. The model's output can then recommend which other products to offer the buyer.
Sample column settings - a set of special objects showing the correspondence between the columns of the forecast model and the columns of the forecast sample.
Result column settings - let you control which columns are placed in the resulting table of the forecast model.
Model result - a value table consisting of the columns specified in the result column settings and containing the predicted data. The specific content is determined by the type of analysis.
Data analysis report builder - an object that outputs a report on the result of data analysis. In addition, the report builder provides a special object for binding to data, so that the user can interactively manage the analysis parameters, the columns of the data source, the columns of the forecast model, and so on.

Types of analysis

The mechanism allows you to perform the following types of analysis:

General statistics
Association search
Sequence search
Decision tree
Cluster analysis

The data analysis mechanism in 1C 8.2 and 8.3 simplifies the developer's work of identifying patterns in various data. For example, with this mechanism you can display the products that are most often bought together. Another example is building a sales forecast from historical data. This is not the full range of uses of the data analysis mechanism in 1C; its capabilities are covered in more detail below.

The main objects of the data analysis mechanism in 1C

In the 1C: Enterprise system this mechanism is represented by three objects:

Data analysis - an object performing data analysis. For it, you must specify the data source and the necessary parameters for analysis.
The result of data analysis is an object that is the result of the work of data analysis.
Forecast model - created from the result of data analysis. This object is the final link in the 1C analysis mechanism and generates a value table that contains the predicted values.
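The chain of three objects can be pictured with a small Python sketch. It is not the platform API: the class names, method names and the "average quantity per product" model are hypothetical and chosen only to show how an analysis object yields a result, and how the result yields a forecast model that returns a table of predicted values.

```python
from collections import defaultdict
from dataclasses import dataclass

@dataclass
class ForecastModel:
    """Hypothetical model: predicts the historical average per product."""
    averages: dict

    def predict(self, rows):
        # Returns a table (list of dicts) with a predicted column added.
        return [{**row, "predicted_qty": self.averages.get(row["product"])}
                for row in rows]

@dataclass
class AnalysisResult:
    """Hypothetical result: a simple model of data behavior."""
    averages: dict

    def create_forecast_model(self):
        return ForecastModel(self.averages)

class DataAnalysis:
    """Hypothetical analyzer: takes a data source and produces a result."""
    def __init__(self, data_source):
        self.data_source = data_source  # any iterable of row dicts

    def execute(self):
        totals, counts = defaultdict(float), defaultdict(int)
        for row in self.data_source:
            totals[row["product"]] += row["qty"]
            counts[row["product"]] += 1
        return AnalysisResult({p: totals[p] / counts[p] for p in totals})

history = [{"product": "tea", "qty": 2}, {"product": "tea", "qty": 4},
           {"product": "sugar", "qty": 1}]
result = DataAnalysis(history).execute()
model = result.create_forecast_model()
forecast = model.predict([{"product": "tea"}, {"product": "salt"}])
print(forecast)
```

Keeping the result separate from the model reflects the point made in the text: a result can be stored and inspected on its own, while the model is what is applied to new data.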

Types of data analysis in 1C 8.3

The 1C: Enterprise system can use different types of analysis; consider them in more detail.

General statistics - a simple statistical sample from a data source. An example is the analysis of stock sales for a period. The result of the analysis is information on how much of each product was sold. The system also calculates specific fields: maximum, minimum, median, average, range, standard deviation, number of values, number of unique values, mode.
Association search - a type of analysis designed to find frequently occurring combinations. It is well suited to finding items that are frequently purchased together. As a result of the analysis, the system produces information about the processed data, the associative groups, and the associative rules by which the groups are formed.
Sequence search - an analysis that identifies patterns in the analyzed data and offers a further forecast. As a result of the analysis, the system shows the probability of certain events occurring, expressed as a percentage.
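The fields listed for general statistics are standard descriptive measures. As a minimal illustration (not platform code), they can be computed for a small numeric column with Python's standard `statistics` module. The sample values are invented, and the text does not say whether the platform uses the population or the sample standard deviation; the population form is assumed here.

```python
import statistics

sales = [3, 5, 5, 8, 12]  # invented quantities sold per document

summary = {
    "maximum": max(sales),
    "minimum": min(sales),
    "median": statistics.median(sales),
    "average": statistics.mean(sales),
    "range": max(sales) - min(sales),
    # Assumption: population standard deviation (divides by n, not n - 1).
    "standard_deviation": statistics.pstdev(sales),
    "count": len(sales),
    "unique_count": len(set(sales)),
    "mode": statistics.mode(sales),
}

for name, value in summary.items():
    print(f"{name}: {value}")
```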

One of the main trends in the market of accounting and management systems is the steadily growing demand for analytical data processing tools that support informed decision-making. That is why constant expansion of economic and analytical reporting has been one of the strategic directions of development of the 1C: Enterprise software system. Today, however, customers are no longer satisfied with traditional tools that generate reports, pivot tables and charts from predefined indicators and relationships and leave the analysis to be done manually. Increasingly, enterprises need qualitatively different tools that automatically search for non-obvious rules and identify unknown patterns (Fig. 1). Data mining methods (IAD) make it possible to derive qualitatively new knowledge from the information a company has accumulated and sometimes to make entirely non-trivial decisions that improve business efficiency.
Fig. 1. The logic of development of the "intelligence" of the analytical problems being solved.

The release of the new 1C: Enterprise 8.0 technology platform in the summer of 2003 significantly expanded the business analytics capabilities of the system (see sidebar). One important point must be made here, however. The 1C platform software evolves not only in "steps" from version to version; it is also continuously improved and extended within a single version, in two directions: technological and applied. Since version 8.0 was first announced, more than a dozen platform releases have come out. The latest release (as of January 2006) is 8.0.13, and it differs greatly from what the platform was two and a half years earlier.

One direction of development of "1C: Enterprise 8.0" is precisely its business intelligence mechanisms; in particular, IAD tools appeared in it only in 2005. Note that most analysis functions are implemented at the level of the technology platform and reach users only after they are included in new releases of application solutions. There is therefore a gap (sometimes several months) between the appearance of new features and their availability to users. To narrow this gap, in September 2005 1C released a special application solution, the Data Analysis Subsystem (PAD), which can be integrated into any configuration on the 1C: Enterprise 8.0 platform. Besides a wide range of basic functions, the package includes more than 30 pre-configured models for the standard "Trade Management" configuration. PAD includes qualitatively new IAD tools that were previously absent from 1C programs.

Analyzing and forecasting data directly requires no special skills or knowledge, only a good command of the subject area being analyzed and an understanding of the main cause-effect relationships in it. Preparing data sources and forecast models requires the ability to use the query builder and knowledge of how information is stored in configuration metadata objects.

The IAD algorithms included in the new configuration (version 1.0.5) form analytical models (templates) that describe the patterns in the source data. These models have value in their own right (they can be reused) and are also used for automated generation of forecasts, including scenario forecasts with previously unknown indicators (Fig. 2). The IAD mechanism is a set of interacting objects of the built-in language, so the developer can use its components in any combination in any application solution. Built-in objects make it easy to let the user tune analysis parameters interactively and to output the analysis result in a convenient form to a spreadsheet document. Applying one of the analysis types to the source data yields a result that is a model of data behavior. The analysis result can be displayed in a final document or saved for future use: based on it, you can create a forecast model that predicts the behavior of new data.
Fig. 2. The general scheme of the data mining mechanism.

The current version of the subsystem implements the methods that are most widely used commercially in world practice, namely:

clustering - implements a grouping of objects, maximizing intragroup similarity and intergroup differences;
decision tree - provides the construction of a causal hierarchy of conditions leading to specific decisions;
association search - searches for stable combinations of elements in events or objects.

Below we consider in more detail the essence and possibilities of the practical application of these methods of IAD.

Clustering

The purpose of clustering is to single out a number of relatively homogeneous groups (segments, or clusters) from a set of objects of the same nature. Objects are divided into groups so that differences within a group are minimal and differences between groups are maximal (Fig. 3). Clustering methods let you move from an object-by-object to a group representation of a collection of arbitrary objects, which greatly simplifies handling them. Several possible scenarios for using clustering in practice are described below.

Customer segmentation by a given set of parameters makes it possible to identify stable groups of customers with similar preferences, sales levels and solvency, which greatly simplifies customer relationship management.

Product classification very often relies on rather arbitrary classification principles. Selecting segments by a group of formal criteria makes it possible to define truly homogeneous groups of goods. With a wide and rather diverse product range, managing the assortment at the segment level rather than at the item level significantly increases the effectiveness of promotion, pricing, merchandising and supply chain management.

Manager segmentation lets you plan organizational changes more effectively, improve motivation schemes and adjust the requirements for new hires, which ultimately increases the manageability of the company and the stability of the business as a whole.
Fig. 3. Data analysis by clustering.

The similarity and difference between objects is determined by the "distance" between them in the space of factors. How the distance is measured depends on the metric, which defines the principle for determining the similarity or difference between sample objects. The current implementation supports the following metrics:

"Euclidean metric" - the standard distance between two points in an N-dimensional Euclidean attribute space;
"Euclidean metric squared" - strengthens the effect of differences (distances) on the clustering result;
"city-block metric" - reduces the influence of outliers;
"dominance metric" - defines the difference between sample objects as the maximum difference between the values of their attributes, so it is useful for emphasizing differences between objects in a single attribute.

The clustering method determines how clusters are formed from the information about the distances between the clustered objects. The current version of "1C: Enterprise 8.0" implements the following clustering methods:

"nearest neighbor" - the object joins the group for which the distance to the nearest object is minimal;
"farthest neighbor" - the object joins the group for which the distance to the farthest object is minimal;
"center of gravity" - the object joins the group for which the distance to the center of the cluster is minimal;
k-means method - arbitrary objects are chosen as the initial cluster centers, then all analyzed objects are processed one by one and attached to the nearest cluster. After an object joins, the new cluster center is calculated as the average of the attribute values of all objects in the cluster. The procedure is repeated until the cluster centers stop changing.
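To make the four metrics and the k-means procedure concrete, here is a minimal Python sketch. It is not the platform's implementation. Two assumptions are made for the sake of a short, deterministic example: the first k points seed the clusters (the text says only that arbitrary objects are chosen), and centers are recomputed once per pass over all objects (the common batch variant) rather than after each individual object joins.

```python
import math

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def euclidean_squared(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def city_block(a, b):
    # Sum of absolute differences; a single large deviation weighs less.
    return sum(abs(x - y) for x, y in zip(a, b))

def dominance(a, b):
    # Largest difference in any one attribute.
    return max(abs(x - y) for x, y in zip(a, b))

def k_means(points, k, distance=euclidean_squared, max_iterations=100):
    centers = [tuple(p) for p in points[:k]]  # assumption: first k points as seeds
    clusters = [[] for _ in range(k)]
    for _ in range(max_iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: distance(p, centers[i]))
            clusters[nearest].append(p)
        # New center = per-attribute average of the cluster members.
        new_centers = [
            tuple(sum(col) / len(cluster) for col in zip(*cluster)) if cluster else centers[i]
            for i, cluster in enumerate(clusters)
        ]
        if new_centers == centers:  # centers stopped changing
            break
        centers = new_centers
    return centers, clusters

p, q = (1.0, 2.0, 3.0), (4.0, 6.0, 3.0)
print(euclidean(p, q), euclidean_squared(p, q), city_block(p, q), dominance(p, q))

points = [(1.0, 1.0), (1.0, 2.0), (8.0, 8.0), (9.0, 8.0), (2.0, 1.0), (8.0, 9.0)]
centers, clusters = k_means(points, k=2)
print(centers)
```

The number of clusters is passed explicitly, as the methods described in the text require. Attribute weights could be added by multiplying each difference by a weight inside the metric functions.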

Any of the clustering methods implemented in the platform requires an explicit indication of the number of clusters sought. You can enter weights for the attributes of objects, which allows you to prioritize between them. As a result of the analysis using clustering, the following data are obtained:

cluster centers, which are a set of averaged values of the input columns in each cluster;
a table of intercluster distances (distances between the centers of clusters), which determine the degree of difference between them;
forecast column values for each cluster;
ranking of factors and a tree of conditions that determined the distribution of objects into clusters.

Clustering algorithms not only perform cluster analysis of objects over the specified set of attributes, but can also predict the value of one or more of those attributes for the current sample by assigning the objects of the sample to a particular cluster.

Association Search

This method is designed to identify persistent combinations of elements in certain events or objects. Analysis results are presented as groups of associated elements. Here, in addition to the identified stable combinations of elements, detailed analytics on associated elements is given (Fig. 4).
Fig. 4. Presentation of the results of "association search" analysis as groups of associated elements.

The method was originally developed to search for typical combinations of goods in purchases, so it is sometimes called market basket analysis. In this scenario the associated elements are, as a rule, product groups or individual goods. The grouping object that combines the elements of a sample can be any object of the information system that identifies a transaction: for example, a customer order, a service delivery act or a cash receipt.

Information about patterns in customers' product preferences increases the effectiveness of customer relationship management (advertising and marketing campaigns), pricing (bundled offers and discounts), inventory management and merchandising (placement of goods on the sales floor). Another use of the method is determining which combinations of advertising channels customers prefer, in order to avoid duplicating them in targeted advertising campaigns. This can significantly reduce the cost of such campaigns.

The association search algorithm implemented in the platform has fairly flexible means of controlling the adequacy of analysis and forecast models. The parameter "Minimum percentage of cases" sets the threshold at which the algorithm reacts to a combination of elements in an event or object, so that rare associations are ignored. The parameter "Minimum certainty" sets the required stability of the associations sought, and the parameter "Minimum significance" lets you single out the highest-priority ones. The parameter "Type of clipping rules", which can take the values "Cut off excess" and "Cut off covered by other rules", makes the results of analysis and forecasting much easier to read.

For a practical interpretation of the results obtained with this algorithm, it is critically important to split the initial set of associated elements into groups that are truly homogeneous from the point of view of the analysis.
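The role of the thresholds can be seen in a small market-basket sketch in Python. This is not the platform's algorithm. It assumes that "Minimum percentage of cases" corresponds to what the data mining literature calls support and "Minimum certainty" to confidence; the text does not define "Minimum significance", so the lift value computed below is only one common way to rank rules and is shown as an assumption. The function name and the receipts are invented, and only pairs of items are considered.

```python
from itertools import combinations

def pair_rules(transactions, min_support, min_confidence):
    """Return rules (lhs, rhs, support, confidence, lift) for item pairs."""
    n = len(transactions)
    baskets = [set(t) for t in transactions]
    items = sorted(set().union(*baskets))
    share = {i: sum(i in b for b in baskets) / n for i in items}
    rules = []
    for a, b in combinations(items, 2):
        # Share of transactions containing both items.
        support = sum(a in bk and b in bk for bk in baskets) / n
        if support < min_support:
            continue  # too rare: ignored, like rare associations in the text
        for lhs, rhs in ((a, b), (b, a)):
            confidence = support / share[lhs]  # how often rhs accompanies lhs
            if confidence >= min_confidence:
                lift = confidence / share[rhs]  # assumption: used as "significance"
                rules.append((lhs, rhs, support, confidence, lift))
    return rules

receipts = [["tea", "sugar"], ["tea", "sugar", "lemon"], ["tea"], ["bread"]]
rules = pair_rules(receipts, min_support=0.5, min_confidence=0.6)
for rule in rules:
    print(rule)
```

With these receipts only the tea and sugar pair passes the support threshold, and it yields a rule in each direction with different confidence values.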

Decision tree

Applying this method to the source data produces a hierarchical (tree-like) structure of rules of the form "if ... then ...", and the analysis algorithm isolates the most significant conditions and the transitions between them at each stage. This algorithm is most widely used for identifying cause-effect relationships in data and for describing behavioral models. A typical application area of decision trees is the assessment of various risks, for example a client closing an order or moving to a competitor, a supplier delivering goods late, or a delay in payment of a trade credit (Fig. 5). Typical input factors of the model are the amount and composition of the order, the current balance of mutual settlements, the credit limit, the prepayment percentage, delivery conditions and other parameters characterizing the object being forecast. Adequate risk assessment supports informed decisions that optimize the profitability / risk ratio in the company's activities, and also helps make budgets more realistic.

Fig. 5. Based on the input factors of the model (a), the decision tree method yields an assessment of the risks of particular managerial decisions (b).

The task of optimizing the work of a sales department illustrates the algorithm's ability to identify causal relationships. To solve it, choose an indicator of sales manager effectiveness as the predicted value, for example specific profitability per client, and as factors a set of data that potentially affects the result. The algorithm determines the factors that have the greatest impact on the result, as well as typical combinations of conditions leading to a particular result. Moreover, the "Data Analysis" subsystem lets you estimate (predict) the expected values of the target indicator from relevant data and make "what if ..." forecasts by changing the indicators supplied to the input of the model. The results of analysis and forecasting with decision trees can significantly reduce the impact of business environment uncertainty on the state of the company and solve a wide range of problems that involve identifying complex and non-obvious causal relationships.

The decision tree algorithm forms a causal hierarchy of conditions that leads to specific decisions. Applying the method to a training set produces a hierarchical (tree-like) structure of splitting rules of the form "if ... then ...". The analysis algorithm (model training) is an iterative process of isolating the most significant conditions and the transitions between them. Conditions can be either quantitative or qualitative and form the "branches" of this abstract tree. Its "foliage" is formed by the values of the predicted attribute (the decision), which, like the transition conditions, can be interpreted both qualitatively and quantitatively.

The set of conditions imposed on the factors, together with the structure of the transitions between them down to the final decision, forms the forecast model. This algorithm is most widely used for assessing the outcomes of various chains of events and for identifying cause-effect relationships in samples. The significance and reliability of the model are controlled with the parameters "Simplification Type", "Maximum Tree Depth" and "Minimum Number of Elements in a Node". The results of analyzing a sample with the decision tree algorithm are:

factor rating, which is a list of the factors that influenced the decision, sorted in descending order of importance ("citation" in tree nodes);
a mapping between the decisions (values of the forecast column) and the conditions that determined them, in other words the "Effect-Reason" tree;
the Cause-Effect tree, which is a set of transitions between conditions that defines a particular decision (in effect, a visual representation of the forecast model).
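A tiny Python sketch can show how such an "if ... then ..." hierarchy is grown and then used for a forecast. It is not the platform's algorithm: the text does not name the criterion used to pick the most significant condition, so Gini impurity is used here as a stand-in. The arguments `max_depth` and `min_node_size` are hypothetical counterparts of "Maximum Tree Depth" and "Minimum Number of Elements in a Node", and the order data is invented. Only qualitative (categorical) factors are handled.

```python
from collections import Counter

def gini(labels):
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def build_tree(rows, target, factors, max_depth=3, min_node_size=1):
    labels = [r[target] for r in rows]
    majority = Counter(labels).most_common(1)[0][0]
    if max_depth == 0 or len(set(labels)) == 1 or not factors:
        return majority  # leaf: the predicted decision

    def split_impurity(factor):
        groups = {}
        for r in rows:
            groups.setdefault(r[factor], []).append(r[target])
        return sum(len(g) / len(rows) * gini(g) for g in groups.values())

    best = min(factors, key=split_impurity)  # most significant condition
    branches = {}
    for r in rows:
        branches.setdefault(r[best], []).append(r)
    if any(len(b) < min_node_size for b in branches.values()):
        return majority  # split would create a node that is too small
    rest = [f for f in factors if f != best]
    return {"factor": best, "default": majority,
            "branches": {v: build_tree(b, target, rest, max_depth - 1, min_node_size)
                         for v, b in branches.items()}}

def predict(tree, row):
    # Follow "if factor == value then ..." transitions down to a leaf.
    while isinstance(tree, dict):
        tree = tree["branches"].get(row[tree["factor"]], tree["default"])
    return tree

orders = [
    {"prepaid": "yes", "limit": "high", "late": "no"},
    {"prepaid": "yes", "limit": "low", "late": "no"},
    {"prepaid": "no", "limit": "high", "late": "no"},
    {"prepaid": "no", "limit": "low", "late": "yes"},
    {"prepaid": "no", "limit": "low", "late": "yes"},
]
tree = build_tree(orders, "late", ["prepaid", "limit"])
print(tree)
print(predict(tree, {"prepaid": "no", "limit": "low"}))
```

Changing the input row passed to `predict` is the code-level counterpart of a "what if ..." forecast.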

Joint solutions "1C"

Besides the functions implemented directly in the 1C: Enterprise 8.0 platform, the 1C business analytics toolset is extended by specialized solutions, including ones created within the 1C-Joint project (http://v8.1c.ru/solutions) with the participation of the company's partners and independent developers (see "Joint solutions of 1C and its partners", BYTE / Russia No. 9, 2005). Two products related to the use of data mining methods are worth noting here: "1C: Enterprise 8.0. 1C-VIP Anatech: ABIS. ABC. Management accounting and costing" (development partner: the consulting company VIP Anatech) and "1C-VIP Anatech-VDGB: ABIS. BSC. Balanced Scorecard" (development partners: VIP Anatech and VDGB).

Typical Business Scenarios for Using IAD Methods

The PAD documentation has a section devoted to typical examples of applying data mining to the "1C: Trade Management 8.0" configuration. Here are some of these business scenarios.

Customer relationship management

Scenario "Advertising campaign planning"
Planning an upcoming advertising campaign is considered from the point of view of optimizing how the allocated budget is distributed across advertising channels, based on regional, product, customer and other indicators of the target segment, as well as on the effectiveness of the advertising channels in these breakdowns in a previous planning period.
Algorithm - "Cluster analysis".
Forecast attributes - the share of responses to each advertising channel from the conditionally homogeneous segments identified by the algorithm.
Computed columns - the share of each advertising channel in the campaign budget, taking into account the likely share of responses and the effectiveness (in terms of resulting revenue) of each channel.
Example of a pattern: Class A customers of Region P who prefer product group P are attracted by the same advertising channel as customers of Region H who prefer product group U.

Supply chain management

Scenario "Optimizing the selection of suppliers by product group"
Selecting dominant first-line suppliers for key product groups is extremely important for stabilizing the logistics system in particular and the supply chain management system as a whole, as well as for reducing the average duration of supply chains. On the other hand, closer integration with major suppliers can, as a rule, significantly reduce the cost of goods. It is therefore useful to analyze stable combinations of suppliers across product groups together with analytics on the suppliers associated within the groups. This reveals the "intersections" of suppliers across product groups and helps optimize relationships with them.
Algorithm - "Association search".
Forecast attributes - stable combinations of suppliers.
Key factors - product groups.
Drill-down - analytics by supplier (volume of purchases, revenue, terms of delivery and payment, lead times: pessimistic, optimistic, average).
Example of a pattern: a stable association of a large but unpredictable supplier A and a predictable mid-sized supplier B across a large number of product groups. When placing orders for competitive product groups, the mid-sized supplier can be positioned as the main one as long as the order volume does not exceed a certain threshold (above which the large supplier gives a significant gain from scale).

Personnel Management

Scenario "Profiling sales managers by key performance indicators"
Determining the effectiveness of managers (retention, customer acquisition, communication efficiency, collection of conditional and unconditional receivables, specific performance indicators per client, and so on) is of interest not only for building a system of financial incentives for managers, but also for setting effective norms for the parameters of their work.
Algorithm - "Decision trees".
Forecast attributes - key performance indicators of the sales department (number of key customers, churn and acquisition ratios, lost revenue per month, acquired income per month, monthly income per client, total revenue from customers, and so on).
Key factors - the number of active customers, revenue, income, specific indicators per client, communication efficiency. Depending on the forecast attributes, the set of factors can vary significantly.
Example of a pattern: managers with the best receivables collection (the ratio of cash receipts to revenue) have a retention rate > 0.8; an acquisition ratio > 0.25; between 10 and 15 simultaneously open deals; between 3 and 10 events per day; and between 50 and 100 active customers in the period.

Conclusion

Modern business is so multifaceted that the factors potentially influencing a particular decision can number in the dozens. Competition intensifies day by day, product life cycles shorten, and customer preferences change ever faster. To develop, a business has to react as dynamically as possible to a rapidly changing environment, taking into account subtle and sometimes elusive patterns of events. Which groups of customers will respond to the promotion, and which will leave for competitors for good? Open a new business line or wait a while? Will the buyer delay payment, or the supplier the shipment? What are the growth opportunities and where are the potential threats? Thousands of managers ask themselves and their colleagues such questions daily. The data analysis subsystem implemented on the 1C: Enterprise 8.0 platform is designed to help users of the corporate information system find answers to non-trivial questions faster, by automatically converting the data accumulated in the information system into useful, readily interpretable patterns.

Economic and analytical reporting in "1C: Enterprise 8.0"

The "1C: Enterprise 8.0" platform includes a number of mechanisms for producing economic and analytical reporting that generate interactive documents (not just printed forms) within various application solutions. The user can therefore work with reports as with any screen form: change report parameters, rearrange the report, use drill-down (obtaining additional reports based on individual elements of an already generated report), and so on. In addition, there are several universal software tools that can generate arbitrary reports depending on the task. Sufficiently experienced users who know the structure of the application solution well can do this themselves. The main means of preparing reports in "1C: Enterprise 8.0" are briefly considered below.

Queries - one of the ways to access data in "1C: Enterprise 8.0". A query extracts information from the database under given conditions, usually combined with simple processing of the retrieved data: grouping, sorting, calculation. Data cannot be changed with queries, since they were designed for quickly retrieving information from large data sets. The database is implemented as a set of interconnected tables, which can be accessed individually or several at a time in conjunction. To implement their own algorithms, developers can use a query language that is based on SQL and contains many extensions reflecting the specifics of financial and economic tasks and reducing the effort needed to create application solutions. The platform includes a query designer, which lets you compose correct query text by visual means alone (Fig. 6).
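The "group, sort, calculate" pattern of a read-only query can be shown with plain SQL. The sketch below uses Python's standard `sqlite3` module and an invented `sales` table; it is ordinary SQL, not the 1C query language, whose extensions are not reproduced here.

```python
import sqlite3

# In-memory database with an invented sales table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (product TEXT, region TEXT, amount REAL)")
conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", [
    ("tea", "north", 120.0), ("tea", "south", 80.0),
    ("sugar", "north", 50.0), ("sugar", "south", 30.0),
])

# Read-only query: group by product, calculate a total, sort by it.
rows = conn.execute(
    "SELECT product, SUM(amount) AS total FROM sales "
    "GROUP BY product ORDER BY total DESC"
).fetchall()
conn.close()

for product, total in rows:
    print(product, total)
```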

Fig. 6. The query designer (a) allows the developer to compose the query text (b) exclusively by visual means.

Spreadsheet document. This is a powerful mechanism for visualizing and editing information, including dynamic reading of information from the database. A spreadsheet document can be used on its own or be part of any form used in the applied solution. At its core it resembles a spreadsheet (it consists of rows and columns in which data is placed), but its capabilities are much wider. It supports grouping, drill-down, and notes. Various kinds of report formatting can be applied in the document, including charts. A spreadsheet document may contain pivot tables, which are themselves an effective tool for presenting multidimensional data both programmatically and interactively.

Output form designer. It helps the developer create reports and present report data in a convenient tabular or graphical form. It includes all the features of the query designer, and also creates and customizes the form.

Report builder. This is an object of the built-in language that makes it possible to create a report dynamically, both programmatically and interactively (Fig. 7). It is based on a query, and the user can interactively configure all the main parameters contained in the query text. The results of the query are output to a spreadsheet document, which can also use information from arbitrary data sources. Using the report builder, the developer can change the set of parameters available to the user for customization.

Fig. 7. Scheme of the report builder.

Geographic schemes. They make it possible to visualize information that has a territorial reference: to countries, regions, cities. Data can be displayed on them in various ways: as text, histograms, color, pictures, circles of various diameters and colors, or pie charts. This makes it possible to show, for example, sales by region in graphical form. The user can change the scale of the displayed scheme, drill down by clicking on objects of the scheme, and even create new geographic schemes. A geographic scheme can also be used simply to display certain geographic data, for example, directions to an office or a vehicle route.

Data mining. These mechanisms make it possible to identify non-obvious patterns that are usually hidden behind large volumes of information. They use mutually complementary knowledge discovery methods that have become the most widespread commercially in world practice: clustering (grouping similar objects), association search (searching for stable combinations of events and objects), and decision trees (building a causal hierarchy of conditions leading to certain decisions).

Query Console and Report Console. Neither console is part of the technology platform; both are external reports that can be run in any applied solution. They help a developer or an experienced user compose query text and analyze its results, or produce an arbitrary report, respectively.

The data analysis and forecasting mechanism is one of the mechanisms for building economic and analytical reports. It gives users (economists, analysts, and others) the ability to search for non-obvious patterns in the data accumulated in the infobase. This mechanism allows you to:

search for patterns in the source data of the information base;
manage the parameters of the analysis performed both programmatically and interactively;
provide programmatic access to the analysis result;
automatically display the result of the analysis in a spreadsheet document;
create forecast models that automatically predict subsequent events or the values of certain characteristics of new objects.

The data analysis mechanism is a set of interacting objects of the built-in language, which lets the developer use its components in any combination in any applied solution. The built-in objects make it easy to organize interactive configuration of the analysis parameters by the user, and to output the analysis result to a spreadsheet document in a convenient form.

The mechanism can work both with data obtained from the infobase and with data received from an external source and loaded beforehand into a value table or a spreadsheet document.

Applying one of the analysis types to the source data produces an analysis result, which is a kind of model of the data's behavior. The analysis result can be output to a final document or saved for later use.

The analysis result can then serve as the basis for a forecast model, which predicts the behavior of new data in accordance with the existing model.

For example, you can analyze which products are purchased together (on one invoice) and save this analysis result in the database. Later, when the next invoice is created, you can build a forecast model from the saved analysis result, feed it the new data contained in this invoice as input, and receive a forecast as output: a list of goods that the counterparty Petrov B.S. will most likely also buy if they are offered to him.

The data analysis and forecasting mechanism implements several types of data analysis:

Implemented Analysis Types

General Statistics

General statistics is a mechanism for collecting information about the data in the sample under study. This type of analysis is intended for a preliminary study of the analyzed data source.

The analysis reports a number of characteristics of continuous and discrete fields. Continuous fields are those containing types such as Number or Date; fields of all other types are treated as discrete. When the report is output to a spreadsheet document, pie charts are filled in to show the composition of the fields.
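The split between continuous and discrete fields can be illustrated with a short sketch in plain Python. It is not the platform's built-in language or API: the function name `general_statistics`, the sample rows, and the choice of characteristics (minimum, maximum, mean, value counts) are assumptions made only for illustration.

```python
from collections import Counter
from datetime import date
from statistics import mean


def general_statistics(rows):
    """Summarize each column of a list of dict rows (hypothetical helper).

    Numbers and dates are treated as continuous fields; everything else
    is treated as a discrete field and summarized by value frequencies.
    """
    summary = {}
    for column in rows[0]:
        values = [row[column] for row in rows]
        # bool is a subclass of int, so it is excluded from the numeric check
        is_number = all(
            isinstance(v, (int, float)) and not isinstance(v, bool) for v in values
        )
        is_date = all(isinstance(v, date) for v in values)
        if is_number or is_date:
            stats = {"kind": "continuous", "min": min(values), "max": max(values)}
            if is_number:
                stats["mean"] = mean(values)
            summary[column] = stats
        else:
            summary[column] = {"kind": "discrete", "counts": Counter(values)}
    return summary


# Made-up sample data, not taken from any real infobase
sample = [
    {"customer": "A", "amount": 120.0, "shipped": date(2024, 1, 5)},
    {"customer": "B", "amount": 80.0, "shipped": date(2024, 1, 9)},
    {"customer": "A", "amount": 100.0, "shipped": date(2024, 1, 7)},
]
stats = general_statistics(sample)
```

The value counts of a discrete field are the kind of data a pie chart of field composition would be drawn from.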

Association Search

This type of analysis searches for frequently occurring groups of objects or characteristic values, and also searches for association rules. Association search can be used, for example, to identify goods or services that are frequently purchased together.

This type of analysis can work with hierarchical data, which makes it possible, for example, to find rules not only for specific products but also for their groups. An important feature of this type of analysis is the ability to work both with an object data source, in which each column contains some characteristic of the object, and with an event source, where the characteristics of the object are located in a single column.

To make the result easier to read, a mechanism for pruning redundant rules is provided.
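As a rough illustration of what an association rule is, the following Python sketch counts how often pairs of items appear on the same invoice and keeps rules of the form "if A then B" that pass two thresholds. It is not the platform's algorithm: the function `association_rules`, the threshold names, and the sample invoices are assumptions, and simple thresholds are used here only as one possible way to keep the rule list short.

```python
from collections import Counter
from itertools import combinations


def association_rules(transactions, min_support=0.5, min_confidence=0.6):
    """Return (if_item, then_item, support, confidence) for item pairs.

    support    = share of transactions containing both items
    confidence = share of transactions with if_item that also contain then_item
    """
    n = len(transactions)
    item_counts = Counter()
    pair_counts = Counter()
    for items in transactions:
        unique = sorted(set(items))
        item_counts.update(unique)
        pair_counts.update(combinations(unique, 2))

    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n
        if support < min_support:
            continue  # the pair is too rare to be of interest
        for if_item, then_item in ((a, b), (b, a)):
            confidence = count / item_counts[if_item]
            if confidence >= min_confidence:
                rules.append((if_item, then_item, support, confidence))
    return rules


# Made-up invoices: each inner list is the set of goods on one invoice
invoices = [
    ["bread", "butter", "milk"],
    ["bread", "butter"],
    ["bread", "milk"],
    ["butter", "jam"],
]
rules = association_rules(invoices)
```

With this sample, every invoice containing milk also contains bread, so the rule "milk, then bread" gets a confidence of 1.0, while rare pairs such as butter and jam are filtered out by the support threshold.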

Sequence Search

The sequence search analysis type identifies sequential chains of events in the data source. For example, this may be a chain of goods or services that customers often purchase one after another.

This type of analysis can search along a hierarchy, which makes it possible to track not only sequences of specific events but also sequences of their parent groups.

The set of analysis parameters allows the specialist to limit the time intervals between the elements of the sequences being searched for, and to adjust the accuracy of the results.
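The idea of a sequence limited by a time interval can be sketched in Python as follows. The sketch only looks for ordered pairs of purchases and is much simpler than a real sequence search; the function `frequent_sequences`, its parameters `max_gap` and `min_customers`, and the sample events are all hypothetical.

```python
from collections import Counter
from datetime import date, timedelta


def frequent_sequences(events, max_gap, min_customers=2):
    """Find ordered pairs (first, then) bought by enough customers.

    events: list of (customer, purchase_date, item).
    A pair counts for a customer if `then` was bought on a later date
    than `first`, and no more than max_gap after it.
    """
    by_customer = {}
    for customer, when, item in events:
        by_customer.setdefault(customer, []).append((when, item))

    counts = Counter()
    for history in by_customer.values():
        history.sort()  # chronological order
        seen = set()    # count each pair once per customer
        for i, (t1, first) in enumerate(history):
            for t2, then in history[i + 1:]:
                if t2 - t1 > max_gap:
                    break  # later purchases are even further away
                if t2 > t1 and first != then:
                    seen.add((first, then))
        counts.update(seen)
    return {pair: c for pair, c in counts.items() if c >= min_customers}


# Made-up purchase history
events = [
    ("c1", date(2024, 1, 1), "printer"),
    ("c1", date(2024, 1, 10), "cartridge"),
    ("c2", date(2024, 2, 1), "printer"),
    ("c2", date(2024, 2, 20), "cartridge"),
    ("c3", date(2024, 3, 1), "printer"),
    ("c3", date(2024, 6, 1), "cartridge"),
]
result = frequent_sequences(events, max_gap=timedelta(days=30))
```

Here the third customer bought the cartridge three months after the printer, so with a 30-day limit only two customers support the sequence "printer, then cartridge".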

Cluster analysis

Cluster analysis divides the initial set of objects under study into groups, so that each object is more similar to the objects in its own group than to the objects in other groups. By further analyzing the resulting groups, called clusters, you can determine what characterizes each group and decide how to work with the objects of different groups. For example, using cluster analysis, you can divide the clients the company works with into groups in order to apply different strategies when working with them.

Using the parameters of cluster analysis, the analyst can configure the algorithm by which the partition is performed, dynamically change the set of characteristics taken into account during the analysis, and set weighting coefficients for them.

The result of clustering can be output to a dendrogram, a special object designed to display sequential relationships between objects.
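To show how weighting coefficients can influence the partition, here is a minimal k-means sketch in Python with a weighted Euclidean distance. K-means is used only as one common clustering algorithm; the source does not say which algorithms the platform implements. The function names, the starting centers, and the client data are assumptions.

```python
def weighted_distance(a, b, weights):
    """Euclidean distance in which each characteristic has its own weight."""
    return sum(w * (x - y) ** 2 for x, y, w in zip(a, b, weights)) ** 0.5


def k_means(points, centers, weights, iterations=10):
    """Assign points to the nearest center, then move each center to the
    mean of its members; repeat a fixed number of times."""
    for _ in range(iterations):
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(
                range(len(centers)),
                key=lambda i: weighted_distance(p, centers[i], weights),
            )
            clusters[nearest].append(p)
        centers = [
            tuple(sum(c) / len(members) for c in zip(*members)) if members else center
            for members, center in zip(clusters, centers)
        ]
    return clusters, centers


# Made-up clients: (annual turnover in thousands, orders per year)
clients = [(10, 2), (12, 3), (11, 2), (95, 40), (100, 38), (105, 42)]
clusters, centers = k_means(
    clients,
    centers=[clients[0], clients[3]],  # explicit starting centers keep the run deterministic
    weights=(1.0, 1.0),
)
```

Setting a weight to zero removes the corresponding characteristic from the comparison, which is one simple way to think about changing the set of characteristics taken into account.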

Decision tree

The decision tree analysis type builds a hierarchical structure of classification rules, presented in the form of a tree.

To build a decision tree, you need to select the target attribute, for which the classifier will be built, and a number of input attributes, which will be used to create the rules. The target attribute may contain, for example, information about whether the client has switched to another service provider, whether the transaction was successful, or whether the work was performed well. Input attributes may be, for example, the employee's age, length of service, the client's financial status, or the number of employees in the company.

The result of the analysis is presented in the form of a tree, each node of which contains a certain condition. To decide which class a new object belongs to, you answer the questions in the nodes and follow the chain from the root to a leaf of the tree, going to the child nodes in the case of an affirmative answer and to the neighboring node in the case of a negative one.
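The walk from the root to a leaf can be sketched in Python. The tree below is written by hand rather than built from data, and it is modeled as a simple binary tree with a "yes" branch and a "no" branch, which is an assumption about the layout. The `Node` class, the attribute names, and the class labels are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Node:
    label: Optional[str] = None                        # set only on leaves: the predicted class
    question: Optional[Callable[[dict], bool]] = None  # condition checked in an inner node
    yes: Optional["Node"] = None                       # branch for an affirmative answer
    no: Optional["Node"] = None                        # branch for a negative answer


def classify(node, obj):
    """Answer the question in each node until a leaf is reached."""
    while node.label is None:
        node = node.yes if node.question(obj) else node.no
    return node.label


# Hand-written example tree; the target attribute is whether a client leaves
tree = Node(
    question=lambda c: c["years_as_client"] < 2,
    yes=Node(
        question=lambda c: c["complaints"] > 1,
        yes=Node(label="likely to leave"),
        no=Node(label="likely to stay"),
    ),
    no=Node(label="likely to stay"),
)

prediction = classify(tree, {"years_as_client": 1, "complaints": 3})
```

In this example `years_as_client` and `complaints` play the role of input attributes, and the leaf labels are values of the target attribute.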

A set of analysis parameters allows you to adjust the accuracy of the resulting tree.

Forecast Models

The forecast models created by the mechanism are special objects that are built from the result of data analysis and make it possible to produce forecasts for new data automatically.

For example, an association search forecast model built from an analysis of customer purchases can be used while serving a customer who is making a purchase, in order to offer products that the customer is likely, with a certain probability, to buy along with the goods already chosen.
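The purchase example above can be sketched in Python: previously saved rules are applied to the contents of a new invoice to produce a ranked list of suggestions. This only illustrates the idea and does not use the platform's forecast model objects; the function `forecast`, the rule format, and the confidence values are assumptions.

```python
def forecast(rules, basket, top_n=3):
    """Suggest items for a basket using saved association rules.

    rules: list of (if_item, then_item, confidence) from an earlier analysis.
    Returns up to top_n (item, confidence) pairs, most likely first.
    """
    best = {}
    for if_item, then_item, confidence in rules:
        # A rule applies when its condition is in the basket
        # and its suggestion is not already there
        if if_item in basket and then_item not in basket:
            best[then_item] = max(best.get(then_item, 0.0), confidence)
    ranked = sorted(best.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:top_n]


# Made-up rules standing in for a saved analysis result
saved_rules = [
    ("printer", "cartridge", 0.8),
    ("printer", "paper", 0.6),
    ("laptop", "mouse", 0.7),
    ("paper", "stapler", 0.3),
]
suggestions = forecast(saved_rules, basket={"printer", "paper"})
```

The analysis that produces the rules runs once, and the cheap lookup in `forecast` can then be repeated for every new invoice.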

The use of data analysis in applied solutions

To help developers of applied solutions get acquainted with the data analysis mechanism, a demo infobase is supplied on the "Information Technology Support" (ITS) disk. It includes the universal data processor "Data Analysis Console", which allows you to perform data analysis in any applied solution without modifying the configuration.
