{"id":13476,"date":"2014-10-29T22:39:13","date_gmt":"2014-10-29T22:39:13","guid":{"rendered":"http:\/\/obegef.pt\/wordpress\/?p=13476"},"modified":"2014-10-29T22:39:13","modified_gmt":"2014-10-29T22:39:13","slug":"32-working-paper","status":"publish","type":"post","link":"https:\/\/obegef.pt\/wordpress\/?p=13476","title":{"rendered":"32# Working Paper"},"content":{"rendered":"<table border=\"0\" cellspacing=\"2\" cellpadding=\"2\">\n<tbody>\n<tr>\n<td align=\"center\" bgcolor=\"#c0c0c0\" width=\"270\"><a href=\"http:\/\/obegef.pt\/wordpress\/wp-content\/uploads\/2014\/10\/wp032.pdf\" target=\"_blank\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-13478 size-full\" src=\"http:\/\/obegef.pt\/wordpress\/wp-content\/uploads\/2014\/10\/wp032.jpg\" alt=\"\" width=\"250\" height=\"176\" \/><\/a><\/td>\n<td valign=\"top\"><strong>Autor:<\/strong> Concei\u00e7\u00e3o Rocha, Al\u00edpio M\u00e1rio Jorge, M\u00e1rcia Oliveira, Paula Brito, Jo\u00e3o Gama, Carlos Pimenta<br \/>\n<strong>T\u00edtulo: <\/strong>From entity extraction to network analysis: a method and an application to a Portuguese textual source<br \/>\n<strong>Editor:<\/strong>\u00a0Edi\u00e7\u00f5es H\u00famus &amp; OBEGEF<br \/>\n<strong>Data:<\/strong>\u00a02014, Nov.<br \/>\n<strong>P\u00e1ginas:\u00a0<\/strong>20<br \/>\n\u00a9: Observat\u00f3rio de Economia e Gest\u00e3o de Fraude<strong><strong><br \/>\n<\/strong><\/strong><strong>Formato ficheiro:<\/strong>\u00a0pdf (portable document format)<br \/>\n<strong>Dimens\u00e3o: <\/strong>4012 kb<br \/>\n<strong>Solicita\u00e7\u00e3o:<\/strong>\u00a0<a href=\"mailto:geral@gestaodefraude.eu?Subject=Sobre%20%20wp027\">Transmita-nos<\/a>\u00a0a sua opini\u00e3o sobre este trabalho.<br \/>\n<!--more--><\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p style=\"text-align: justify;\">(Carregue na imagem para importar o livro)<\/p>\n<p style=\"text-align: justify;\"><strong><strong>Resumo:<\/strong>\u00a0<\/strong><\/p>\n<p style=\"text-align: 
justify;\"><a href=\"http:\/\/obegef.pt\/wordpress\/wp-content\/uploads\/2009\/01\/pt.jpg\"><img loading=\"lazy\" decoding=\"async\" class=\"size-full wp-image-9 alignleft\" src=\"http:\/\/obegef.pt\/wordpress\/wp-content\/uploads\/2009\/01\/pt.jpg\" alt=\"texto em portugu\u00eas\" width=\"24\" height=\"24\" \/><\/a>Este artigo d\u00e1 a conhecer os avan\u00e7os conseguidos na extra\u00e7\u00e3o de entidades (identifica\u00e7\u00e3o de entidades referidas) num processo de minera\u00e7\u00e3o de texto cujo objetivo \u00e9 revelar estruturas sem\u00e2nticas n\u00e3o triviais, tais como rela\u00e7\u00f5es e intera\u00e7\u00f5es entre as entidades ou comunidades. \u00c9 proposto um m\u00e9todo de tr\u00eas fases aplic\u00e1vel \u00e0 l\u00edngua Portuguesa e potencialmente a outras l\u00ednguas. O m\u00e9todo baseia-se em correspond\u00eancia de padr\u00f5es flex\u00edvel, na marca\u00e7\u00e3o da categoria morfo-sint\u00e1tica de cada palavra, em regras lexicais e na dist\u00e2ncia entre os nomes das entidades. Todas as etapas s\u00e3o implementadas em software livre usando v\u00e1rios pacotes dispon\u00edveis. A avalia\u00e7\u00e3o da efic\u00e1cia do m\u00e9todo de extra\u00e7\u00e3o de entidades \u00e9 feita tendo por base uma parte de um livro escrito em portugu\u00eas observando-se uma melhoria na medida F1. Para uma melhor compreens\u00e3o e avalia\u00e7\u00e3o da utilidade do m\u00e9todo proposto apresentamos um caso de um livro sobre Ma\u00e7onaria. \u00c9 tamb\u00e9m definida uma rede social das entidades referidas com base exclusivamente em cita\u00e7\u00f5es do livro. 
Da\u00ed s\u00e3o extra\u00eddas informa\u00e7\u00f5es estruturais que revelam conex\u00f5es, relacionamentos e comunidades entre as entidades.<\/p>\n<p style=\"text-align: justify;\"><a id=\"resumo\" href=\"http:\/\/obegef.pt\/images\/gf_php\/gfindex.php?p=f&amp;publ=wp030\" target=\"resumo\" name=\"resumo\"><br \/>\n<\/a><a href=\"http:\/\/obegef.pt\/wordpress\/wp-content\/uploads\/2009\/01\/en.jpg\"><img loading=\"lazy\" decoding=\"async\" class=\"size-full wp-image-11 alignleft\" src=\"http:\/\/obegef.pt\/wordpress\/wp-content\/uploads\/2009\/01\/en.jpg\" alt=\"Texto em ingl\u00eas\" width=\"25\" height=\"25\" \/><\/a>This paper reports advances in the entity extraction task (named entity identification) of a text mining process that aims at unveiling non-trivial semantic structures, such as relationships and interactions between entities or communities. We propose a 3-phase method that is applicable to the Portuguese language and potentially applicable to other languages as well. The method relies on flexible pattern matching, part-of-speech tagging, lexical-based rules and distance-based entity name merging. All steps are implemented in free software, taking advantage of various existing packages. Evaluation of the efficacy of the entity extraction method on part of a book written in Portuguese indicates improved F1 results. For further evaluation and illustration of the usefulness of the proposed method, we apply it to a book on Freemasonry and observe the differences in the entity word clouds produced. 
We also define a social network of named entities solely from information contained in the book and extract structural insights that reveal connections, relationships and communities between entities.<\/p>\n<p><strong>\u00a9 Direitos de autor<\/strong>:<\/p>\n<p><em><strong>\u00c9 permitida<\/strong>\u00a0a importa\u00e7\u00e3o gratuita.<\/em><br \/>\n<strong>\u00c9 permitida<\/strong>\u00a0a c\u00f3pia de partes deste documento, sem qualquer modifica\u00e7\u00e3o, para utiliza\u00e7\u00e3o individual. A reprodu\u00e7\u00e3o de partes do seu conte\u00fado \u00e9 permitida exclusivamente em documentos cient\u00edficos, com indica\u00e7\u00e3o expressa da fonte.<br \/>\n<strong>N\u00e3o \u00e9 permitida<\/strong>\u00a0qualquer utiliza\u00e7\u00e3o comercial. N\u00e3o \u00e9 permitida a sua disponibiliza\u00e7\u00e3o atrav\u00e9s de rede electr\u00f3nica ou qualquer forma de partilha electr\u00f3nica.<br \/>\n<strong>Em caso de d\u00favida<\/strong>\u00a0ou pedido de autoriza\u00e7\u00e3o, contactar directamente o OBEGEF.<\/p>\n<p>&nbsp;<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Autor: Concei\u00e7\u00e3o Rocha, Al\u00edpio M\u00e1rio Jorge, M\u00e1rcia Oliveira, Paula Brito, Jo\u00e3o Gama, Carlos Pimenta T\u00edtulo: From entity extraction to network analysis: a method and an application to a Portuguese textual source Editor:\u00a0Edi\u00e7\u00f5es H\u00famus &amp; OBEGEF Data:\u00a02014, Nov. 
P\u00e1ginas:\u00a020 \u00a9: Observat\u00f3rio de Economia e Gest\u00e3o de Fraude Formato ficheiro:\u00a0pdf (portable document format) Dimens\u00e3o: 4012 kb Solicita\u00e7\u00e3o:\u00a0Transmita-nos\u00a0a&hellip; <a href=\"https:\/\/obegef.pt\/wordpress\/?p=13476\">Ler mais&#8230;<\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_bbp_topic_count":0,"_bbp_reply_count":0,"_bbp_total_topic_count":0,"_bbp_total_reply_count":0,"_bbp_voice_count":0,"_bbp_anonymous_reply_count":0,"_bbp_topic_count_hidden":0,"_bbp_reply_count_hidden":0,"_bbp_forum_subforum_count":0,"_exactmetrics_skip_tracking":false,"_exactmetrics_sitenote_active":false,"_exactmetrics_sitenote_note":"","_exactmetrics_sitenote_category":0,"footnotes":""},"categories":[71],"tags":[],"class_list":["post-13476","post","type-post","status-publish","format-standard","hentry","category-working-papers-publicacoes"],"_links":{"self":[{"href":"https:\/\/obegef.pt\/wordpress\/index.php?rest_route=\/wp\/v2\/posts\/13476","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/obegef.pt\/wordpress\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/obegef.pt\/wordpress\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/obegef.pt\/wordpress\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/obegef.pt\/wordpress\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=13476"}],"version-history":[{"count":5,"href":"https:\/\/obegef.pt\/wordpress\/index.php?rest_route=\/wp\/v2\/posts\/13476\/revisions"}],"predecessor-version":[{"id":13482,"href":"https:\/\/obegef.pt\/wordpress\/index.php?rest_route=\/wp\/v2\/posts\/13476\/revisions\/13482"}],"wp:attachment":[{"href":"https:\/\/obegef.pt\/wordpress\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=13476"}],"wp:term":[{"taxonomy":"category","embeddable
":true,"href":"https:\/\/obegef.pt\/wordpress\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=13476"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/obegef.pt\/wordpress\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=13476"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}