The Material and Methods section explains the assay procedure and the experimental setup. In many cases the physiological biochemical reaction is not used for the measurement; alternative substrates are included in the experimental setup instead. The Results part describes in detail the measured and analyzed data, which are frequently presented in tables and figures. Sometimes this
section already contains the Discussion of the results, which relates and compares the information to data from other experimentalists. The Discussion or Summary draws conclusions and often repeats parts of the results. This classical paper structure scatters the relevant data across the paper: Figure 1 shows six pages of a selected full paper with a color-coded representation of how different data are distributed within a publication. The colors distinguish between different types of information (e.g. protein data, relevant experimental methods, or kinetic data). Figure 1 also shows the same data structured as a SABIO-RK database entry. The data described in the example publication result in 23 different entries in SABIO-RK, each entry having the same structure. The separation of related data within a paper makes automatic information extraction very difficult. Without an understanding of the complete paper, it is almost impossible to collect and restructure the data in a correct
way. Therefore the available tools for automatic information extraction are not suitable for the full extraction process. For example, if a kinetic law equation used for the determination of kinetic parameters is described, all values given in the equation should be extracted and inserted in the database entry. For the example paper in Figure 1, passages in the text containing kinetic parameters and data about
the mathematical equation are highlighted in green, showing that the data are distributed across the text and also appear in tables and figures. This is typical of how such data are reported in a paper. Based on these findings we investigated the distribution and representation format of kinetic parameters within the above-mentioned list of about 300 articles. Kinetic parameters (e.g. Km, Ki, kcat, Vmax), which are important for the description of enzyme and reaction characteristics and comprise the key data of the SABIO-RK database, can be found in three types of representation: (i) free text, (ii) tables and (iii) figures. Such an inconsistent representation makes it hard to use or develop automatic information-extraction methods. Parameters are described in free text in 80% of the analyzed articles, displayed in tables in about 65% and in figures in about 8%. In 31.8% of the publications parameters appear only in free text and in 18.2% only in tables. About 42% of the papers have parameters in both text and tables.
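To illustrate why free-text extraction alone falls short, the following is a minimal sketch of a pattern-based extractor for kinetic parameters. It is not a tool used for SABIO-RK; the function name extract_parameters, the regular expression, and the short list of units are assumptions chosen for illustration only.

```python
import re

# Hypothetical pattern (an assumption, not a curated grammar):
# a parameter name, up to 40 non-digit characters of filler text,
# a decimal number, and one of a few common units.
PARAM_PATTERN = re.compile(
    r"\b(?P<name>Km|Ki|kcat|Vmax)\b"
    r"[^0-9]{0,40}?"
    r"(?P<value>\d+(?:\.\d+)?)"
    r"\s*(?P<unit>[µu]M|mM|nM|s-1|min-1)"
)


def extract_parameters(text):
    """Return (name, value, unit) tuples for parameters stated inline in text."""
    return [
        (m.group("name"), float(m.group("value")), m.group("unit"))
        for m in PARAM_PATTERN.finditer(text)
    ]


if __name__ == "__main__":
    # Values written inline in a sentence can be picked up.
    print(extract_parameters("The Km for glucose was 0.25 mM and kcat was 12 s-1."))
    # A sentence that only points to a table yields nothing.
    print(extract_parameters("Kinetic parameters are summarized in Table 2."))
```

A pattern of this kind can only reach values that are written out in running text. Parameters reported solely in tables or figures, and the link between a value and its enzyme, substrate, or assay conditions described elsewhere in the paper, remain out of reach without reading the complete paper.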