Bone Age Modelling

In this project, we focus on processing anthropological data with several data mining methods. The goal is to predict the age of individuals from a set of parameters measured on their skeletons. The data are problematic because they are very noisy. The methods are tuned and parameterized to give the best possible performance on these data. Their performance is compared, and a recommendation on how to process noisy and partially inconsistent data will be one of the final conclusions of this project.

Publications

  • Pavel Kordík: Regularization of Evolving Polynomial Models. In: Proceedings of the International Workshop on Inductive Modelling (IWIM 2007), 2007. ISBN 978-80-01-03881-9 BibTex, PDF
  • P. Kordík: Fully Automated Knowledge Extraction using Group of Adaptive Models Evolution. At: Czech Technical University in Prague, FEE, Dep. of Comp. Sci. and Computers, 2006 BibTex, PDF

    Keywords such as data mining (DM) and knowledge discovery (KD) have appeared in several thousand articles in recent years. This popularity is driven mainly by demand from private companies, which need to analyze their data effectively to obtain new, useful knowledge that can be capitalized on. This process is called knowledge discovery, and data mining is a crucial part of it. Although several methods and algorithms for data mining have been developed, there are still many gaps to fill. The problem is that real-world data are so diverse that no universal algorithm has been developed to mine all data effectively. In addition, the stages of the knowledge discovery process require the full-time assistance of an expert in data preprocessing, data mining and knowledge extraction. These problems can be solved by a KD environment capable of automatic data preprocessing, of generating regression and predictive models and classifiers, of automatically identifying interesting relationships in data (even complex and high-dimensional data), and of presenting the discovered knowledge in a comprehensible form. In order to develop such an environment, this thesis focuses on research into methods in the areas of data preprocessing, data mining and information visualization. The Group of Adaptive Models Evolution (GAME) is a data mining engine able to adapt itself and perform optimally on a large (but still limited) group of real-world data sets. The Fully Automated Knowledge Extraction using GAME (FAKE GAME) framework is proposed to automate the KD process and to eliminate the need for the assistance of a data mining expert. The GAME engine is the only GMDH-type algorithm capable of solving very complex problems (as demonstrated on the Spiral data benchmarking problem). It can handle irrelevant inputs and short, noisy data samples. It uses an evolutionary algorithm to find the optimal topology of models. Ensemble techniques are employed to estimate the quality and credibility of GAME models.
Within the FAKE framework we designed and implemented several modules for data preprocessing, knowledge extraction and for visual knowledge discovery.
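
The abstract mentions that ensemble techniques are used to estimate the credibility of models, but does not say how. As a rough illustration of the general idea only, and not of GAME's actual method, the sketch below fits a bootstrap ensemble of simple linear models to synthetic noisy data and treats the spread of their predictions as a credibility signal. The function names, the synthetic data and the choice of a linear model are all assumptions made for this example.

```python
import random
import statistics


def fit_line(xs, ys):
    """Ordinary least squares fit of y = a + b*x; returns (a, b)."""
    mean_x = statistics.fmean(xs)
    mean_y = statistics.fmean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def bootstrap_ensemble(xs, ys, n_models, rng):
    """Fit n_models lines, each on a resample drawn with replacement."""
    n = len(xs)
    models = []
    while len(models) < n_models:
        idx = [rng.randrange(n) for _ in range(n)]
        sample_x = [xs[i] for i in idx]
        if len(set(sample_x)) < 2:
            continue  # degenerate resample: slope is undefined
        models.append(fit_line(sample_x, [ys[i] for i in idx]))
    return models


def predict_with_spread(models, x):
    """Return (mean prediction, standard deviation across the ensemble)."""
    preds = [a + b * x for a, b in models]
    return statistics.fmean(preds), statistics.stdev(preds)


# Synthetic noisy data (illustrative only): y = 2 + 3x + Gaussian noise.
rng = random.Random(0)
xs = [10 * i / 39 for i in range(40)]
ys = [2 + 3 * x + rng.gauss(0, 2) for x in xs]

models = bootstrap_ensemble(xs, ys, n_models=50, rng=rng)
for x in (5.0, 30.0):  # inside the training range vs. far outside it
    mean, spread = predict_with_spread(models, x)
    print(f"x={x:5.1f}  prediction={mean:7.2f}  spread={spread:.2f}")
```

In this toy setting the ensemble members agree closely inside the range covered by the training data and diverge far outside it, so a large spread can be read as a hint that a prediction is less trustworthy.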