Lecture summaries
Speech synthesis
30 April 2026, 16:00 • Eugénio Ribeiro
Text normalization.
Grapheme-to-phone conversion.
Prosody.
Main approaches.
Speech recognition
30 April 2026, 14:30 • Eugénio Ribeiro
Architecture of a speech recognition system.
Feature extraction.
Language models.
Evaluation.
Applications of automatic speech processing
23 April 2026, 16:00 • Eugénio Ribeiro
Examples of applications of automatic speech processing.
Introduction to automatic spoken language processing
23 April 2026, 14:30 • Eugénio Ribeiro
Phonetics and signal processing.
Computational processing of written language: guest presentation
16 April 2026, 16:00 • Ricardo Daniel Santos Faro Marques Ribeiro
[ Abstract ]
STRING is a hybrid Natural Language Processing (NLP) system for Portuguese that combines rule-based linguistic knowledge with statistical methods within a modular processing architecture. It supports the full pipeline of core NLP tasks, including tokenization and sentence segmentation, part-of-speech tagging, morphosyntactic disambiguation, shallow parsing (chunking), and deep syntactic parsing through dependency extraction.
In this presentation, we illustrate how STRING handles linguistically complex phenomena by leveraging rich lexical resources and grammar-based constraints. Particular emphasis is placed on its ability to model constructions that remain challenging for purely data-driven systems.
We also discuss ongoing and future developments aimed at extending the system’s coverage and robustness. These include tighter integration with off-the-shelf NLP tools (e.g., spaCy), the incorporation of semantic parsing frameworks such as Lexicalized Meaning Representation, and the expansion of the underlying grammar to better capture support-verb constructions with predicative nouns, as well as a wide range of adverbial constructions.
Ultimately, STRING aims to provide a linguistically grounded, interpretable, and high-performance parsing system, contributing to the development of robust and explainable NLP technologies for Portuguese.
[ Bio ]
Jorge Baptista is Associate Professor with "Agregação" in Language Sciences at the University of Algarve, Portugal, and Senior Researcher at the Human Language Technologies Laboratory (INESC-ID, Lisbon). His work lies at the intersection of theoretical and computational linguistics, with a focus on natural language processing, syntactic and semantic parsing, named-entity recognition, language complexity, and corpus-based Lexicon-Grammar of Portuguese. His recent research includes NLP-based assessment of writing proficiency, the development of large-scale lexical resources, and the modelling of complex constructions in Portuguese. He has published extensively on the computational treatment of grammar, phraseology, and linguistic complexity, and collaborates internationally on projects that bring together linguistic theory, language technology, and education.