Nuevos Algoritmos Basados en Grafos y Clustering para el Tratamiento de Complejidades de los Datos

Gúzman Ponce, Angélica

dc.contributor	Sánchez Garreta, José Salvador
dc.contributor.advisor	Valdovinos Rosas, Rosa María ; 211910
dc.contributor.advisor	Sánchez Garreta, José Salvador;#0000-0003-1053-4658
dc.contributor.author	Gúzman Ponce, Angélica
dc.creator	Gúzman Ponce, Angélica; 702275
dc.date.accessioned	2021-05-14T00:41:41Z
dc.date.available	2021-05-14T00:41:41Z
dc.date.issued	2021-03-17
dc.identifier.uri	http://hdl.handle.net/20.500.11799/110464
dc.description	Doctoral thesis	es
dc.description.abstract	Nowadays, knowledge extraction from data is an essential task for decision-making in many areas. However, data sets commonly present problems (complexities) that degrade the performance of the knowledge extraction process. The imbalanced distribution of data between classes and the presence of noise and/or class overlap are intrinsic data characteristics that frequently degrade knowledge extraction, because data are assumed to keep a uniform distribution and to be free from any other problem. All these issues have been studied in Pattern Recognition and Data Mining because of their impact on the performance of learning models. Thus, this Ph.D. thesis addresses class imbalance, class overlap and/or noise through techniques that reduce and clean the most represented class. Among the solutions to handle the class imbalance problem, new algorithms based on graphs are proposed. This idea arises from the fact that many real-world problems (network analysis, chemical models, remote sensing, among others) have been tackled by using graph-based strategies, in which the problem is expressed in terms of vertices and edges. With this in mind, the proposals presented in this Ph.D. thesis treat the most represented class as a complete graph, in such a way that a representative subset of majority class instances is obtained through reduction criteria. Regarding data sets with class imbalance and class overlap and/or noise, the proposals include the use of clustering algorithms as a cleaning strategy. It is well known that these algorithms are used to group instances according to similar characteristics; however, the proposal presented here makes use of their ability to detect noisy instances. Accordingly, a clustering algorithm is applied before the class imbalance is addressed. As a further extension to the proposals presented in this Ph.D. thesis, and due to the growing interest in Big Data problems, the last part of this report introduces a graph-based algorithm to handle class imbalance in large-scale data sets.	es
dc.description.sponsorship	CONACYT national scholarships	es
dc.language.iso	spa	es
dc.publisher	Universidad Autónoma del Estado de México	es
dc.rights	openAccess	es
dc.rights.uri	http://creativecommons.org/licenses/by-nc-sa/4.0
dc.subject	Research Subject Categories	es
dc.subject	Graphs	es
dc.subject	Pattern Recognition	es
dc.subject	Data mining	es
dc.subject	Clustering	es
dc.subject.classification	ENGINEERING AND TECHNOLOGY	es
dc.title	Nuevos Algoritmos Basados en Grafos y Clustering para el Tratamiento de Complejidades de los Datos	es
dc.title.alternative	New Algorithms Based on Graphs and Clustering for handling Data Complexities	es
dc.type	Doctoral Thesis	es
dc.provenance	Scientific	es
dc.road	Green	es
dc.organismo	Engineering	es
dc.ambito	International	es
dc.cve.CenCos	20501	es
dc.cve.progEstudios	1009	es
dc.modalidad	Thesis	es
dc.audience	students	es
dc.audience	researchers	es
dc.type.conacyt	doctoralThesis
dc.identificator	7

Files in this item

Name: Nuevos_Algoritmos ...

Size: 3.898Mb

Format: PDF

Description: Doctoral thesis


This item appears in the following collection(s)

Conacyt [10019]
Scientific [363]

Document View

Title
Nuevos Algoritmos Basados en Grafos y Clustering para el Tratamiento de Complejidades de los Datos
Author
Gúzman Ponce, Angélica
Thesis advisor(s), compiler(s) or coordinator(s)
Sánchez Garreta, José Salvador
Publication date
2021-03-17
Publisher
Universidad Autónoma del Estado de México
Document type
Doctoral Thesis
Keywords
Research Subject Categories
Graphs
Pattern Recognition
Data mining
Clustering