Parse Analysis and Generation of Turkish Words  v1.0
N-gram tests on Turkish word morphology, and generation of words from parsed words, using foma and TRmorph.
/home/emre/Desktop/nlp/hw01/main.cpp File Reference

Gebze Institute of Technology, CSE 484 Introduction to NLP, HW01. N-gram tests on Turkish word morphology, and generation of words from parsed words, using foma and TRmorph. To use the program, you must first download and install foma.

#include <cstdio>
#include <iostream>
#include <cstdlib>
#include <vector>
#include <string>
#include <sstream>
#include <queue>
#include <iomanip>
#include <fstream>
#include "ngram.hpp"

Functions

void initWords (char *fileName, vector< string > &vect)
bool writeGeneratorFileScript (char *fileName, string str)
bool editWordFile (char *fileName)
void printLoading (int loading)
void runScriptForEveryWord (vector< string > &vect, string argv2)
void ngramParserFromFile (NGram ngram[])
void wordGenerator (NGram ngram[], string fst, int oneG, int twoG, int threeG)
void printResults (NGram ngram[], string fileName)
void printNSize (NGram ngram[], int n)
int main (int argc, char *argv[])

Detailed Description

Gebze Institute of Technology, CSE 484 Introduction to NLP, HW01. N-gram tests on Turkish word morphology, and generation of words from parsed words, using foma and TRmorph. To use the program, you must first download and install foma.

This file contains the main part of the program.

Author:
Emre Sercan Aslan <emre.aslan@e-aslan.net>
Date:
March 24, 2013
Version:
v1.0
See also:
References:

[1] Mans Hulden, Finite-State Compiler and C Library http://code.google.com/p/foma/

[2] Çağrı Çöltekin, TRmorph: A Turkish Morphological Analyzer https://github.com/coltekin/TRmorph


Function Documentation

bool editWordFile ( char *  fileName)

Opens the generated word file and edits it. It is used to clean up the list of generated words.

Parameters:
fileName    Name of the source file
Returns:
true if the file was processed successfully, otherwise false

void initWords ( char *  fileName,
vector< string > &  vect 
)

Reads words from a text file to initialize the word list.

Parameters:
fileName    Name of the source file
vect        Word list vector
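
As an illustration of the behaviour described above, here is a minimal sketch of reading a word list into a vector. The helper names `readWords` and `readWordsFromFile` are hypothetical and are not part of main.cpp; the sketch also assumes that words are separated by whitespace, which the original documentation does not state.

```cpp
#include <fstream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: appends every whitespace-separated word
// read from `in` to `words`.
void readWords(std::istream &in, std::vector<std::string> &words) {
    std::string word;
    while (in >> word) {
        words.push_back(word);
    }
}

// Hypothetical file-based wrapper. Unlike initWords, it reports
// failure: it returns false if the file cannot be opened.
bool readWordsFromFile(const char *fileName, std::vector<std::string> &words) {
    std::ifstream in(fileName);
    if (!in) {
        return false;
    }
    readWords(in, words);
    return true;
}
```

Splitting the stream logic from the file handling keeps the reading code usable with any input stream, for example a string stream in a test.
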
int main ( int  argc,
char *  argv[] 
)

Main routine of the program.

* use case
* read the file and get the words
* parse the words
* analyse the parsed parts
* generate words

void ngramParserFromFile ( NGram  ngram[])

Analyses the morphology of the words using 1-Gram, 2-Gram and 3-Gram tests.

Parameters:
ngram    NGram instances that receive the 1-Gram, 2-Gram and 3-Gram results
See also:
runScriptForEveryWord()
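
The n-gram analysis can be pictured with the following sketch. It is not the NGram class from ngram.hpp, whose interface is not shown here. It assumes that each analysis looks like `ev<N><pl><loc>` (a root followed by tags in angle brackets) and that an n-gram is a run of n consecutive tags; `extractTags` and `countNGrams` are hypothetical names.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Extracts tags of the form <...> from one analysis string.
// Assumes output such as "ev<N><pl><loc>".
std::vector<std::string> extractTags(const std::string &analysis) {
    std::vector<std::string> tags;
    std::size_t open = analysis.find('<');
    while (open != std::string::npos) {
        std::size_t close = analysis.find('>', open);
        if (close == std::string::npos) {
            break;  // unbalanced tag: stop scanning
        }
        tags.push_back(analysis.substr(open, close - open + 1));
        open = analysis.find('<', close + 1);
    }
    return tags;
}

// Counts every run of n consecutive tags. The key is the
// concatenation of the tags in the run, e.g. "<N><pl>".
void countNGrams(const std::vector<std::string> &tags, std::size_t n,
                 std::map<std::string, int> &counts) {
    if (n == 0 || tags.size() < n) {
        return;
    }
    for (std::size_t i = 0; i + n <= tags.size(); ++i) {
        std::string key;
        for (std::size_t j = 0; j < n; ++j) {
            key += tags[i + j];
        }
        ++counts[key];
    }
}
```

Calling `countNGrams` with n set to 1, 2 and 3 on the tags of every parsed word would fill three separate count tables, one per gram test.
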
void printLoading ( int  loading)

Prints the loading state to the terminal.

Parameters:
loading    Percentage of jobs done

void printNSize ( NGram  ngram[],
int  n 
)

Prints only the first n results for each gram test.

Parameters:
ngram    NGram instances for 1-Gram, 2-Gram, 3-Gram
n        Number of POS tags printed to the terminal for each gram test
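
Selecting the first n results could look like the sketch below. It assumes that the results are tag counts held in a `std::map` and that "first" means highest count; both are assumptions, since the NGram class is not documented on this page. `TagCount`, `byCountDesc` and `topN` are hypothetical names.

```cpp
#include <algorithm>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

typedef std::pair<std::string, int> TagCount;

// Orders by descending count, then alphabetically so that
// ties always come out in the same order.
bool byCountDesc(const TagCount &a, const TagCount &b) {
    if (a.second != b.second) {
        return a.second > b.second;
    }
    return a.first < b.first;
}

// Returns at most n entries with the highest counts.
std::vector<TagCount> topN(const std::map<std::string, int> &counts,
                           std::size_t n) {
    std::vector<TagCount> sorted(counts.begin(), counts.end());
    std::sort(sorted.begin(), sorted.end(), byCountDesc);
    if (sorted.size() > n) {
        sorted.resize(n);
    }
    return sorted;
}
```
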
void printResults ( NGram  ngram[],
string  fileName 
)

Writes the complete analysis result to a file.

Parameters:
ngram       NGram instances for 1-Gram, 2-Gram, 3-Gram
fileName    Name of the file where the results will be stored

void runScriptForEveryWord ( vector< string > &  vect,
string  argv2 
)

Runs the flookup script for each word in the word list. Writes the parsing results to a file that is used by the ngramParserFromFile function.

Parameters:
vect     Word list vector
argv2    File name, such as "trmorph.fst"
See also:
ngramParserFromFile()
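
One way to run flookup for a single word is to pipe the word into it through a shell, as sketched below. This is an assumption about the approach, not the code in main.cpp. The helpers `shellQuote`, `buildLookupCommand` and `runCommand` are hypothetical, and `popen` is a POSIX call, so the sketch targets Unix-like systems only.

```cpp
#include <cstdio>
#include <string>

// Wraps text in single quotes for a POSIX shell and escapes
// any single quote inside it.
std::string shellQuote(const std::string &text) {
    std::string quoted = "'";
    for (std::string::size_type i = 0; i < text.size(); ++i) {
        if (text[i] == '\'') {
            quoted += "'\\''";
        } else {
            quoted += text[i];
        }
    }
    quoted += "'";
    return quoted;
}

// Builds a command that feeds one word to flookup on standard input,
// using the given transducer file (for example "trmorph.fst").
std::string buildLookupCommand(const std::string &word,
                               const std::string &fst) {
    return "printf '%s\\n' " + shellQuote(word) +
           " | flookup " + shellQuote(fst);
}

// Runs a command through popen and returns everything it printed.
// Returns an empty string if the command could not be started.
std::string runCommand(const std::string &command) {
    std::string output;
    FILE *pipe = popen(command.c_str(), "r");
    if (pipe == NULL) {
        return output;
    }
    char buffer[256];
    while (std::fgets(buffer, sizeof(buffer), pipe) != NULL) {
        output += buffer;
    }
    pclose(pipe);
    return output;
}
```

With foma installed, `runCommand(buildLookupCommand(word, "trmorph.fst"))` would return the analyses for one word, which could then be appended to the results file. Starting one process per word is slow for long lists; sending the whole list to a single flookup process is an alternative.
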
void wordGenerator ( NGram  ngram[],
string  fst,
int  oneG,
int  twoG,
int  threeG 
)

Builds the word generator codes and writes them to a file, runs the foma script to generate words from those codes, and then edits the generated word file to delete inappropriate data.

Parameters:
ngram     NGram instances for 1-Gram, 2-Gram, 3-Gram
fst       File name, such as "trmorph.fst"
oneG      Number of top POS tags from the 1-Gram results to use while generating words
twoG      Number of top POS tags from the 2-Gram results to use while generating words
threeG    Number of top POS tags from the 3-Gram results to use while generating words
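
The generation step can be sketched as follows, under two assumptions that the page does not confirm: a generator code is a root followed by a tag sequence (such as `ev<N><pl>`), and failed generations appear in the output as lines ending in `+?`, which is what flookup prints for input it cannot match. `buildGeneratorInputs` and `dropFailedLines` are hypothetical names.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Joins a root with each tag sequence to form one generator code
// per sequence, e.g. "ev" + "<N><pl>" gives "ev<N><pl>".
std::vector<std::string> buildGeneratorInputs(
        const std::string &root,
        const std::vector<std::string> &tagSequences) {
    std::vector<std::string> inputs;
    for (std::size_t i = 0; i < tagSequences.size(); ++i) {
        inputs.push_back(root + tagSequences[i]);
    }
    return inputs;
}

// Keeps only the output lines that do not end in "+?".
std::vector<std::string> dropFailedLines(
        const std::vector<std::string> &lines) {
    std::vector<std::string> kept;
    for (std::size_t i = 0; i < lines.size(); ++i) {
        const std::string &line = lines[i];
        bool failed = line.size() >= 2 &&
                      line.compare(line.size() - 2, 2, "+?") == 0;
        if (!failed) {
            kept.push_back(line);
        }
    }
    return kept;
}
```

The codes could then be passed to the transducer in the generation direction, for example with flookup's inverse lookup option `-i`, although the original program is described only as running a foma script.
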
bool writeGeneratorFileScript ( char *  fileName,
string  str 
)

Writes a string to a file. It is used to write root + tags, from which new words are generated.

Parameters:
fileName    Name of the output file
str         Content that will be written to the output file
Returns:
true if the write succeeded, otherwise false
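
A function with this contract might be written as in the sketch below. `writeStringToFile` is a hypothetical name, and the sketch assumes that an existing file is overwritten; the original may append instead.

```cpp
#include <fstream>
#include <string>

// Writes `content` to `fileName`, replacing any existing file.
// Returns true only if the file was opened and the write succeeded.
bool writeStringToFile(const char *fileName, const std::string &content) {
    std::ofstream out(fileName);
    if (!out) {
        return false;
    }
    out << content;
    out.flush();
    return out.good();
}
```

Checking the stream state after flushing also catches errors that occur during the write, such as a full disk, not just a failure to open the file.
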