PPGSC - Mestrado em Sistemas e Computação

Permanent URI for this collection: https://repositorio.ufrn.br/handle/123456789/12059


An experimental investigation of letter identification and scribe predictability in medieval manuscripts
(2020-01-16) Nascimento, Francimaria Rayanne dos Santos; Abreu, Marjory Cristiany da Costa; Carvalho, Bruno Motta de; Cavalcante, Everton Ranielly de Sousa; Cavalcanti, George Darmiton da Cunha
Although handwriting might seem archaic today in comparison with typed communication, it is a long-established human activity that has survived into the 21st century. Accordingly, research interest in handwritten documents, both historical and modern, is significant. The way we write has changed significantly over the past centuries. For example, the texts of the Middle Ages were often written and copied by anonymous scribes. The writing of each scribe, known as his or her 'scribal hand', is unique and can be differentiated using a variety of consciously and unconsciously produced features. Distinguishing between these different scribal hands is a central focus of the humanities research field known as "palaeography". This process may be supported and/or enhanced using digital techniques, and thus digital writer identification from historical handwritten documents has also flourished. Automating the recognition of individual characters within each scribal hand has also posed an interesting challenge. Some issues make these digital processes difficult for medieval handwritten documents, including the degradation of the paper and soiling of the manuscript page. Thus, in this dissertation, we investigate both perspectives, character recognition and writer identification, in medieval manuscripts, in an attempt to better understand the specific behaviour of two 800-year-old scribes based on their manuscripts in comparison with a modern calligrapher. The experiments showed that degradation and tremor (when present) can influence the analysis of old handwritten documents. However, the results showed good accuracy, with a higher accuracy rate for letter classification than for writer identification.
An investigation of biometric-based user predictability in the online game League of Legends
(2019-02-07) Silva, Valmiro Ribeiro da; Abreu, Marjory Cristiany da Costa; Canuto, Anne Magaly de Paula; Souza Neto, Placido Antonio de
Computer games have been established as a favourite activity for years now. Although such games were created to promote competition and self-improvement, there are some recurrent issues. One that has received the least attention so far is the problem of "account sharing", in which a player shares his/her account with more experienced players in order to progress in the game. The companies running those games tend to punish this behaviour, but this specific case is hard to identify. Since the popularity of machine learning techniques has never been higher, the aim of this study is to better understand how biometric data from online games behaves, how the choice of character impacts a player, and how different algorithms perform when we vary how frequently a sample is collected. Through statistical tests, the experiments showed how consistent a player can be even when he/she changes characters or roles, the impact of more training samples, how the results of the tested machine learning algorithms are affected by how often we collect our samples, and how dimensionality reduction techniques such as Principal Component Analysis affect our data, all providing more information about how this state-of-the-art game database works.
An investigative analysis of obvious and non-obvious Bias in judicial data using supervised and unsupervised machine learning techniques
(Universidade Federal do Rio Grande do Norte, 2021-07-05) Silva, Bruno dos Santos Fernandes da; Abreu, Marjory Cristiany da Costa; http://lattes.cnpq.br/2234040548103596; http://lattes.cnpq.br/9229268386945230; Cavalcante, Everton Ranielly de Sousa; http://lattes.cnpq.br/5065548216266121; Oliveira, Laura Emmanuella Alves dos Santos Santana de; http://lattes.cnpq.br/8996581733787436; Souza Neto, Plácido Antônio de; http://lattes.cnpq.br/3641504724164977
Brazilian courts have been working on the virtualisation of judicial processes since the turn of this century and, since then, a massive volume of data has been produced. Computational techniques have been a close ally in facing the increasing number of accumulated and new lawsuits in the system. However, although there is a misconception that automation solutions are always 'intelligent', which in most cases is not true, there has never been any discussion about the use of intelligent solutions for this end, nor of any issues related to automatic prediction and decision making using historical data in context. One of the problems that has already come to light is bias in judicial datasets worldwide. This work aims to analyse a judicial dataset, looking for decision bias and assessing the suitability of intelligent algorithms. Motivated by the social impact of bias in the decision-making process, we selected the gender and social condition of the indicted as classes for investigation. We used a dataset of judicial sentences (built by the Além da Pena research group), identified the data structure and distribution, created supervised and unsupervised machine learning models applied to the dataset, and analysed the occurrence of obvious and non-obvious bias related to judicial decisions. To investigate obvious bias, we used classification techniques based on the k-Nearest Neighbours, Naive Bayes and Decision Trees algorithms; to investigate non-obvious bias, we used unsupervised algorithms such as k-Means and Hierarchical Clustering. Our experiments did not achieve a conclusive detection of bias, but they suggest a trend that would confirm its occurrence in the dataset and, therefore, the need for deeper analysis and improved techniques.
Analysing the performance of ClassAge: a multi-agent system for pattern classification
(Universidade Federal do Rio Grande do Norte, 2006-10-26) Abreu, Marjory Cristiany da Costa; Canuto, Anne Magaly de Paula; http://buscatextual.cnpq.br/buscatextual/visualizacv.do?id=K4790093J8; http://lattes.cnpq.br/2234040548103596; Carvalho, André Carlos Ponce de Leon Ferreira de; http://lattes.cnpq.br/9674541381385819; Campos, André Mauricio Cunha; http://lattes.cnpq.br/7154508093406987
The use of systems based on the agent paradigm for solving pattern recognition problems has been proposed in order to solve, or mitigate, the problem of centralised decision making in multi-classifier systems and, as a consequence, to improve their classification ability. To address this problem, the NeurAge System was proposed. This system is composed of neural agents that can communicate and negotiate a common result for test patterns. In the NeurAge System, the negotiation methods are very important for improving the system's accuracy, since the agents need to reach the best solution and resolve conflicts, when they exist, regarding a problem. This dissertation presents an extension of the NeurAge System that can use any type of classifier and is now called the ClassAge System. It analyses the behaviour of the ClassAge System under several modifications to the topology and to the configurations of the system's components.
Face biometrics for differentiating typical development and autism spectrum disorder: a methodology for collecting and evaluating a dataset
(Universidade Federal do Rio Grande do Norte, 2022-09-16) Budke, Jaine Rannow; Abreu, Marjory Cristiany da Costa; https://orcid.org/0000-0001-7461-7570; http://lattes.cnpq.br/2234040548103596; http://lattes.cnpq.br/6545013954007575; Carvalho, Bruno Motta de; Souza Neto, Plácido Antônio de; https://orcid.org/0000-0003-1233-4510; http://lattes.cnpq.br/3641504724164977
Autism Spectrum Disorder (ASD) is a neurodevelopmental disorder marked by deficits in communication and social interaction. The standard diagnostic protocol is based on a qualified professional assessing descriptive criteria, which does not establish precise measures and contributes to late diagnosis. Therefore, new diagnostic approaches need to be explored to better standardise clinical practice. The ideal scenario would be a reliable automated system that indicates the diagnosis with a satisfactory level of assurance. However, to date there are no public, representative datasets aimed at exploring alternative diagnoses. This work investigates the differences in facial expressions between people with ASD and people with Typical Development. To this end, a new dataset of facial images was collected from YouTube videos, and computer vision techniques were used to extract frames from the videos, filter the dataset, and extract facial features from the images. We also performed initial experiments using classical supervised learning models, as well as ensembles, and achieved promising results.
A framework for investigating the use of face features to identify spontaneous emotions
(Universidade Federal do Rio Grande do Norte, 2014-12-12) Bezerra, Giuliana Silva; Abreu, Marjory Cristiany da Costa; http://lattes.cnpq.br/2234040548103596; http://lattes.cnpq.br/0962295420081741; Carvalho, Bruno Motta de; http://lattes.cnpq.br/0330924133337698; Lopes, Fívia de Araújo; http://lattes.cnpq.br/2583445528542625; Schwartz, William Robson; http://lattes.cnpq.br/0704592200063682
Emotion-based analysis has attracted a lot of interest, particularly in areas such as forensics, medicine, music, psychology, and human-machine interfaces. Following this trend, facial analysis (either automatic or human-based) is the most commonly investigated subject, since this type of data can easily be collected and is well accepted in the literature as a metric for inferring emotional states. Despite this popularity, due to several constraints found in real-world scenarios (e.g. lighting, complex backgrounds, facial hair and so on), automatically and accurately obtaining affective information from the face is very challenging. This work presents a framework which aims to analyse emotional experiences through naturally generated facial expressions. Our main contribution is a new 4-dimensional model to describe emotional experiences in terms of appraisal, facial expressions, mood, and subjective experiences. In addition, we present an experiment using a new protocol proposed to obtain spontaneous emotional reactions. The results suggested that the initial emotional state described by the participants of the experiment was different from that described after exposure to the eliciting stimulus, thus showing that the stimuli used were capable of inducing the expected emotional states in most individuals. Moreover, our results indicated that spontaneous facial reactions to emotions are very different from those in prototypic expressions due to the lack of expressiveness in the latter.
A semi-supervised framework for data classification in continuous streams
(Universidade Federal do Rio Grande do Norte, 2021-06-25) Gorgônio, Arthur Costa; Canuto, Anne Magaly de Paula; Vale, Karliane Medeiros Ovidio; http://lattes.cnpq.br/7907570677010860; http://lattes.cnpq.br/1357887401899097; http://lattes.cnpq.br/8213279977425231; Abreu, Marjory Cristiany da Costa; http://lattes.cnpq.br/2234040548103596; Xavier Júnior, João Carlos; http://lattes.cnpq.br/5088238300241110; Santos, Araken de Medeiros; http://lattes.cnpq.br/8059198436766378
Applications in the data stream domain receive a large volume of data rapidly and need to process it sequentially. A characteristic of these applications is that the data may change while the model is in use; moreover, the number of instances with a known label may not be sufficient to build an effective model. To overcome the difficulty posed by the small number of labelled instances, semi-supervised learning can be used. In addition, classifier ensembles can help detect context changes. Thus, this work proposes a framework for semi-supervised classification in data stream tasks, using an approach based on classifier ensembles. The framework uses the ensemble to evaluate itself and determine when to train a new classifier during the classification process. To evaluate the effectiveness of the proposal, empirical tests were carried out with eleven datasets using two different batch sizes and nine supervised approaches, measured by accuracy, precision, recall and F-score. When evaluating the number of instances processed, the supervised approaches showed practically constant performance, whereas the proposal showed an improvement of 8.28% and 3.81% using 5% and 10% of labelled instances, respectively. Finally, the results of this research are promising: the proposed framework obtained similar or better results in 118 of the 198 cases (60%), in statistical terms.
Investigating fuzzy methods for multilingual speaker identification
(Universidade Federal do Rio Grande do Norte, 2020-08-27) Lima, Thales Aguiar de; Abreu, Marjory Cristiany da Costa; Santin, Altair Olivo; Pereira, Mônica Magalhães
Speech is a crucial ability for humans to interact and communicate. Speech-based technologies are becoming more popular with speech interfaces, real-time translation, and budget healthcare diagnosis. Besides, the use of voice in identification systems is an important and relevant topic. There are several ways of doing it, but most are dependent on the language the user speaks. However, if the idea is to create an all-inclusive and reliable system that uses speech as its input, we must take into account that people can and will speak different languages with different accents. This research evaluates closed-set text-independent speaker identification systems in a multilingual setup, including both fuzzy and crisp models. Our experiments are performed using three widely spoken languages: Portuguese, English, and Chinese. We extracted 13 MFCCs, along with log-energy and their respective delta and delta-delta coefficients, from the signals to use as our feature vector. We adopted four classifiers: Fuzzy C-Means, Fuzzy k-Nearest Neighbours, k-Nearest Neighbours, and Support Vector Machines. Initial tests indicated that the systems have a certain robustness across multiple languages. Accuracy decreases as more languages are included; however, our investigation suggests that this impact comes from the number of classes.
Predspot: predicting crime hotspots with machine learning
(2019-09-24) Araújo Júnior, Adelson Dias de; Cacho, Nélio Alessandro Azevedo; Bezerra, Leonardo César Teonácio; Abreu, Marjory Cristiany da Costa; Kounadi, Ourania
Smart cities are increasingly adopting data infrastructure and analysis to improve the decision-making process for public safety issues. Although traditional hotspot policing methods have shown benefits in reducing crime, previous studies suggest that the adoption of predictive techniques can produce more accurate estimates of future crime concentration. In previous work, we proposed a framework to generate near-future hotspots using spatiotemporal features. In this work, we redesign the framework to support (i) the widely used crime mapping method kernel density estimation (KDE); (ii) geographic feature extraction with data from OpenStreetMap; (iii) feature selection; and (iv) gradient boosting regression. Furthermore, we provide an open-source implementation of the framework to support efficient hotspot prediction for police departments that cannot afford proprietary solutions. To evaluate the framework, we consider data from two cities, namely Natal (Brazil) and Boston (US), comprising twelve crime scenarios. We take as a baseline the common police prediction methodology also employed in Natal. Results indicate that our predictive approach estimates hotspots 1.6-3.1 times better than the baseline, depending on the crime mapping method and machine learning algorithm used. From a feature importance analysis, we found that trend and seasonality features were the most essential components for achieving better predictions.
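As an illustration of the kernel density estimation (KDE) step named in the abstract above, the following is a minimal sketch in plain Python, not the Predspot implementation. The function name `gaussian_kde_2d`, the Gaussian kernel, the bandwidth value, the grid, and the event coordinates are all assumptions made for this example.

```python
import math

def gaussian_kde_2d(points, bandwidth):
    """Build a density estimator from 2D event points using a Gaussian kernel.

    Illustrative helper: the name and signature are hypothetical.
    """
    n = len(points)
    # Normalising constant of an isotropic 2D Gaussian, averaged over n events.
    norm = 1.0 / (n * 2.0 * math.pi * bandwidth ** 2)

    def density(x, y):
        total = 0.0
        for px, py in points:
            sq_dist = (x - px) ** 2 + (y - py) ** 2
            total += math.exp(-sq_dist / (2.0 * bandwidth ** 2))
        return norm * total

    return density

# Made-up event locations in arbitrary planar coordinates (not real crime data).
events = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (3.0, 3.0)]
density = gaussian_kde_2d(events, bandwidth=0.5)

# Evaluate the density on a coarse grid and pick the highest-density cell.
grid = [(gx * 0.5, gy * 0.5) for gx in range(8) for gy in range(8)]
hotspot = max(grid, key=lambda cell: density(*cell))
print(hotspot)
```

In this toy setting the highest-density grid cell falls next to the cluster of three nearby events; a smaller bandwidth yields sharper, more local hotspots, while a larger one smooths them out.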
A probabilistic analysis of the biometrics menagerie existence: case study in fingerprint data
(Universidade Federal do Rio Grande do Norte, 2016-02-18) Araújo, Rayron Victor Medeiros de; Abreu, Marjory Cristiany da Costa; http://lattes.cnpq.br/2234040548103596; http://lattes.cnpq.br/3173912637773195; Carvalho, Bruno Motta de; http://lattes.cnpq.br/0330924133337698; Araújo, Daniel Sabino Amorim de; http://lattes.cnpq.br/4744754780165354; Cavalcanti, George Darmiton da Cunha; http://lattes.cnpq.br/8577312109146354
Until recently, the use of biometrics was restricted to high-security environments and criminal identification applications, for economic and technological reasons. In recent years, however, biometric authentication has become part of people's daily lives. Since then, some authentication problems have come to light, such as an individual being unable to vote in an election because their fingerprint was not recognised. This happens because the users of a biometric system may have different degrees of accuracy, especially in systems used on a large scale. Some of these users may have difficulty authenticating, while others may be particularly vulnerable to impersonation. Recent studies have investigated and identified these types of users, giving them animal names: Sheep, Goats, Lambs, Wolves, Doves, Chameleons, Worms and Phantoms. The aim of this work is to evaluate the existence of these types of users in a fingerprint database and to propose a new way of investigating them, based on the performance of verifications between samples. Our results identified the presence of goats, lambs, wolves, chameleons and phantoms, and demonstrated the absence of worms and doves, in a proposed biometric system.
Quantum computing application in super-resolution
(2019-07-31) Alves, Ystallonne Carlos da Silva; Carvalho, Bruno Motta de; Santos, Araken de Medeiros; Abreu, Marjory Cristiany da Costa
Super-Resolution (SR) is a technique that has been exhaustively exploited and brings strategic aspects to image processing. As quantum computers gradually evolve and provide unconditional proof of a computational advantage over their classical counterparts at solving intractable problems, quantum computing emerges with the compelling argument of offering exponential speed-up for computationally expensive operations. Envisioning the design of parallel, quantum-ready algorithms for near-term noisy devices and building on Rapid and Accurate Image Super Resolution (RAISR), an implementation applying variational quantum computation is demonstrated for enhancing degraded imagery. This study proposes an approach that combines the benefits of RAISR, a non-hallucinating and computationally efficient method, and the Variational Quantum Eigensolver (VQE), a hybrid classical-quantum algorithm, to conduct SR with the support of a quantum computer while preserving quantitative performance in terms of Image Quality Assessment (IQA). It covers the generation of additional hash-based filters learned with the classical implementation of the SR technique, in order to further explore performance improvements, produce images that are significantly sharper, and induce the learning of more powerful upscaling filters with integrated enhancement effects. As a result, it extends the potential of applying RAISR to improve low-quality assets generated by low-cost cameras, and fosters the eventual implementation of robust image enhancement methods powered by quantum computation.
A study about the impact of combining keystroke and handwriting dynamics on gender and emotional state prediction
(2020-04-03) Bandeira, Danilo Rodrigo Cavalcante; Canuto, Anne Magaly de Paula; Nascimento, Diego Silveira Costa; Abreu, Marjory Cristiany da Costa
The use of soft biometrics as an auxiliary tool for hard biometrics in user identification-based systems is already well known. It is not, however, the only possible use of soft biometric data: beyond assisting hard biometrics, soft biometric traits can also be predicted from them. Gender, hand orientation and emotional state are some examples of what can be called soft biometrics. The use of physiological hard biometric modalities for soft biometric prediction is very common in the literature, but behavioral data is often neglected. Two behavioral modalities that are not often found in the literature are keystroke and handwriting dynamics, which have been used individually to predict the user's gender and emotional state, but not in any kind of combination scenario. To fill this gap, this study aims to investigate whether the combination of those two different biometric modalities can impact the accuracy of gender and emotional state prediction. To this end, two combination methods were proposed, data fusion and decision fusion, with decision fusion presenting two variations, the first using a mixture of experts and the second using ensembles. The results achieved by the proposed methods were compared with those of the individual biometric modalities, with a substantial improvement being observed in most combination scenarios. Lastly, all the presented results were confirmed by the application of statistical tests.
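The difference between the two combination strategies named in the abstract above can be sketched as follows. This is a minimal illustration under stated assumptions, not the dissertation's method: the function names, the feature values, and the use of a simple majority vote as the decision-level combination rule are all hypothetical choices made for this example.

```python
from collections import Counter

def data_fusion(keystroke_features, handwriting_features):
    """Feature-level fusion: concatenate both modalities into one vector
    that a single classifier would then be trained on."""
    return list(keystroke_features) + list(handwriting_features)

def decision_fusion(predictions):
    """Decision-level fusion: combine the labels predicted by separate
    classifiers with a majority vote (ties go to the label seen first)."""
    counts = Counter(predictions)
    return counts.most_common(1)[0][0]

# Made-up feature vectors for one user sample.
keystroke = [0.12, 0.30]        # e.g. key hold time, time between keys
handwriting = [1.5, 0.8, 0.4]   # e.g. pen speed, pressure, stroke length

fused_vector = data_fusion(keystroke, handwriting)

# Made-up labels from three independent classifiers for the same sample.
fused_label = decision_fusion(["female", "male", "female"])
print(fused_vector, fused_label)
```

Feature-level fusion lets one model see correlations across modalities, whereas decision-level fusion keeps each modality's classifier independent and only merges their outputs.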
The impact of feature selection methods on online handwritten signature by using clustering-based analysis
(Universidade Federal do Rio Grande do Norte, 2021-01-29) Marques, Julliana Caroline Gonçalves de Araújo Silva; Abreu, Marjory Cristiany da Costa; http://lattes.cnpq.br/2234040548103596; http://lattes.cnpq.br/5554033822360657; Carvalho, Bruno Motta de; http://lattes.cnpq.br/0330924133337698; Souza Neto, Plácido Antônio de; http://lattes.cnpq.br/3641504724164977
The handwritten signature is one of the oldest and most accepted biometric authentication methods for establishing human identity in society. With the popularisation of computers and, consequently, computational biometric authentication systems, the signature was chosen for being one of the biometric traits of an individual that is likely to be relatively unique for every person. However, when dealing with biometric data, including signature data, problems related to high-dimensional spaces can arise. Among other issues, irrelevant or redundant data and noise are the most significant, as they result in a decrease in identification accuracy. Thus, it is necessary to reduce the space by selecting the smallest set of the most discriminative features, increasing the accuracy of the system. Our proposal in this work is therefore to analyse the impact of feature selection on the accuracy of identifying individuals based on the online handwritten signature. For this, we use two well-known online signature databases: SVC2004 and xLongSignDB. For the feature selection process, we applied two filter methods and one wrapper method. The resulting datasets are then evaluated by classification algorithms and validated with a clustering technique. In addition, we used a statistical test to corroborate our conclusions. The experiments presented satisfactory results when using a smaller number of more representative features: we reached an average accuracy of over 98% for both datasets, validated with the clustering methods, which achieved an average accuracy of over 80% (SVC2004) and 70% (xLongSignDB).
Using semi-supervised learning models for creating a new fake news dataset from Twitter posts: a case study on Covid-19 in the UK and Brazil
(Universidade Federal do Rio Grande do Norte, 2022-01-14) Nascimento, Tuany Mariah Lima do; Abreu, Marjory Cristiany da Costa; Oliveira, Laura Emmanuella Alves dos Santos Santana de; http://lattes.cnpq.br/8996581733787436; http://lattes.cnpq.br/2234040548103596; Cavalcante, Everton Ranielly de Sousa; http://lattes.cnpq.br/5065548216266121; Souza Neto, Placido Antônio de; http://lattes.cnpq.br/3641504724164977
Fake news has been a big problem for society for a long time. It has been magnified, reaching worldwide proportions, mainly with the growth of social networks and instant chat platforms where any user can quickly interact with news, either by sharing it, through likes and retweets, or by presenting her/his opinion on the topic. Since this is a very fast phenomenon, it has become humanly impossible to manually identify and highlight all fake news. Therefore, the search for automatic solutions for fake news identification, mainly using machine learning models, has grown considerably in recent times, due to the variety of topics as well as the variety of fake news propagated. Most solutions focus on supervised learning models; however, in some datasets, labels are absent for most of the instances. For this, the literature presents the use of semi-supervised learning algorithms, which are able to learn from a small amount of labelled data. Thus, this work investigates the use of semi-supervised learning models for the detection of fake news, using as a case study the outbreak of the SARS-CoV-2 virus, the COVID-19 pandemic. Our results show that we have an interesting methodology which can be used to build a new social media dataset and automatically label the samples using semi-supervised learning models. An important contribution of this work is also a new fake news dataset.
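The idea of labelling a mostly unlabelled dataset from a few labelled examples can be sketched with self-training, one common semi-supervised scheme. The abstract above does not say which semi-supervised models were used, so everything here is an assumption made for illustration: the nearest-centroid classifier, the confidence margin, the two-dimensional feature vectors, and the label names.

```python
def centroid(points):
    """Mean vector of a non-empty list of equal-length feature vectors."""
    dims = len(points[0])
    return [sum(p[d] for p in points) / len(points) for d in range(dims)]

def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def self_train(labeled, unlabeled, margin=2.0, max_rounds=10):
    """Self-training with a nearest-centroid classifier (needs >= 2 classes).

    labeled: list of (features, label); unlabeled: list of features.
    A sample is pseudo-labelled only when the nearest centroid beats the
    runner-up by at least `margin` in squared distance.
    """
    labeled = list(labeled)
    pool = list(unlabeled)
    for _ in range(max_rounds):
        by_label = {}
        for features, label in labeled:
            by_label.setdefault(label, []).append(features)
        centroids = {lab: centroid(pts) for lab, pts in by_label.items()}
        still_unlabeled = []
        added = False
        for features in pool:
            ranked = sorted(centroids, key=lambda lab: sq_dist(features, centroids[lab]))
            best, runner_up = ranked[0], ranked[1]
            gap = sq_dist(features, centroids[runner_up]) - sq_dist(features, centroids[best])
            if gap >= margin:
                labeled.append((features, best))  # confident: accept pseudo-label
                added = True
            else:
                still_unlabeled.append(features)  # ambiguous: keep for later
        pool = still_unlabeled
        if not added:
            break  # no confident predictions left, so stop early
    return labeled, pool

# Made-up 2D feature vectors standing in for encoded posts.
seed = [([0.0, 0.0], "reliable"), ([4.0, 4.0], "fake")]
posts = [[0.5, 0.2], [3.8, 4.1], [2.0, 2.0]]
labeled, remaining = self_train(seed, posts)
print(len(labeled), remaining)
```

The margin controls the trade-off between how much of the pool gets labelled automatically and how many wrong pseudo-labels are fed back into training; samples that stay ambiguous are left for manual review.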


Browsing PPGSC - Mestrado em Sistemas e Computação by Author "Abreu, Marjory Cristiany da Costa"