ABOUT ME
NLP Research Associate in the School of Computing and Communications at
Lancaster University, working on the ESRC-funded Corporate Financial
Information Environment (CFIE) project.
Previously
Worked as a Developmental Systems and Data Mining Developer at the UK Data
Archive, Essex University.
Education
PhD in Computer Science, Essex University 2012.
MSc in Information Systems, Jordan University 2008.
BSc in Computer Information Systems, Jordan University 2005.
Research Interests
Natural Language Processing (NLP), mainly multi-document text summarisation
for both Arabic and English, information retrieval, question answering,
machine translation, text classification, crowd-sourcing, information
extraction and the creation of NLP resources.
PhD Thesis
Thesis Topic: Multi-document Arabic Text Summarisation.
Candidacy: Researching and investigating the field of Arabic Natural Language
Processing for both single and multi-document text summarisation, and
providing resources and corpora that could help advance research in this
field (thesis abstract is available upon request).
http://serlib0.essex.ac.uk/record=b1807018~S5
Bibtex Reference:
@PHDTHESIS{ElHaj2012,
AUTHOR= {Mahmoud El-Haj},
TITLE= {{Arabic Multi-document Text Summarisation}},
SCHOOL= {{University of Essex}},
YEAR = {2012},
ADDRESS = {{The Albert Sloman Library: University of Essex}},
PAGES = {165},
BOOKNUMBER = {139000488},
NOTE = {Thesis (Ph.D.), School of Computer Science and Electronic
Engineering, University of Essex, 2012},
URL= {http://serlib0.essex.ac.uk/record=b1807018~S5}
}
Download my PhD Thesis
Projects
Corporate Financial Information Environment (CFIE), Lancaster University, UK
The project has five primary objectives:
1. To advance research on the lexical properties and narrative aspects of
corporate disclosures by developing a suite of statistical natural language
processing (NLP) tools for analysing firms' narrative communication
practices.
2. To use the methods developed in objective 1 to measure the linguistic
characteristics of key corporate disclosures (both mandatory and voluntary),
to identify determinants of cross-sectional variation in these
characteristics, and to relate these characteristics to disclosure
informativeness. Analysis of the content of interim management statements
will form a specific application of these methods.
3. To apply the methods developed in objective 1 to advance research on the
interactions between corporate voluntary disclosures and accounting quality,
including new work on the joint effects of corporate disclosure and earnings
management practices on share price anticipation of earnings and companies'
standing in published rankings of investor relations quality.
4. To apply NLP scoring methods to UK corporate news stories in the
financial media with the aim of developing a more complete measure of
corporate financial communications quality.
5. To use the methods and insights from objectives 1 to 4 to provide new
evidence on the links between earnings quality, disclosure quality, and cost
of capital.
Team:
Professor Martin Walker, The University of Manchester.
Professor Steven Young, Lancaster University.
Dr Paul Rayson, Lancaster University.
Dr Mahmoud El-Haj, Lancaster University.
Dr Vasiliki Athanasakou, London School of Economics.
Dr Thomas Schleicher, The University of Manchester.
SKOS-HASSET
The objective of this project is to bring HASSET, the leading and
well-respected English language social science thesaurus, into the Linked
Data web. Its aims are twofold: firstly, it will apply SKOS to HASSET, thus
creating SKOS-HASSET, a Linked Open Data product for the use of the wider
social science community; secondly, it will test SKOS-HASSET's automatic
indexing capabilities in relation to survey data resources. The project is
funded by the Joint Information Systems Committee (JISC).
My role is to automatically index the HASSET thesaurus, publications and
questionnaires, and to evaluate the automatic indexing against manual
indexing by humans. I also apply Natural Language Processing tools to connect
the thesaurus index terms with the related terms in the index of the
publications and questionnaires, to improve the retrieval of these documents.
Updating Digital Preservation and Systems (DPS) at the UK Data Archive, Essex, UK
The objective of this project is to build applications to help organise and
manage the current DPS systems. The project is funded by the Economic and
Social Research Council (ESRC).
My role is to write PowerShell scripts that improve the process of organising
the Archive's studies and manage the creation and download of the studies'
zip bundles. This requires security and validation checks to ensure that the
uploaded studies and zip bundles meet the Archive's required specifications
and standards.
Professional Service
Organiser of the FlatLands 2012 Workshop on Natural Language Processing
Research for postgraduate students at Cambridge, Essex, Open, and Oxford
Universities, held on Friday, 29th June 2012 at Essex University, Wivenhoe
Park, Colchester, Essex, UK.
Conference Reviewing
Award
Best Paper Award at the 4th LTC Conference, Poznan, Poland, 2009. The paper
was then selected to appear in Springer's Lecture Notes in Computer Science.

LINKS
Mahmoud El-Haj LinkedIn
Mahmoud El-Haj Twitter
Mahmoud El-Haj ResearchGate
Mahmoud El-Haj Academia.edu
InfoLab 21
Lancaster University
School of Computing & Communications
_______________________________
CONTACT
Email: m.el-haj @ lancaster.ac.uk
Tel: +44(0)15245 10348
Address:
Office: C28
School of Computing and Communications
InfoLab21
Lancaster University
Bailrigg, Lancaster
Lancashire, LA1 4WA