Clair::Utils

Parse



Summary
Parse - A wrapper around two common parsing tools: the Charniak parser and
the chunklink tool.

Package variables
No package variables defined.

Included modules
Clair::Config
Clair::Document

Synopsis
This module wraps two common parsing tools: the Charniak parser and the chunklink tool.
It provides a simple interface for running each tool on a file and collecting the output.

Description
No description!

Methods
chunklink
forcl (no description)
parse
prepare_for_parse

Methods description


chunklink
my $chunk_output = Clair::Utils::Parse::chunklink("WSJ_0021.MRG", output_file => "output.txt");
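The chunklink method runs a file through the chunklink tool and returns the tool's
output as a string. If output_file is given, the output is also saved to that file.
Two further optional named arguments are accepted: path (the location of the
chunklink tool, defaulting to $CHUNKLINK_PATH) and options (extra command-line
options passed to the tool). Error output from the tool is discarded.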

parse
my $parse_output = Clair::Utils::Parse::parse("to_be_parsed.txt", output_file => "output.txt");
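The parse method runs a file through the Charniak parser and returns the parser's
output as a string. If output_file is given, the output is also saved to that file.
Three further optional named arguments are accepted: path (the parser executable,
defaulting to $CHARNIAK_PATH), data_path (the parser's data directory, defaulting
to $CHARNIAK_DATA_PATH) and options (extra command-line options).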
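
Both chunklink and parse follow the same pattern: run an external command, capture
its standard output as a string, and optionally write that string to a file. The
following is a minimal sketch of that pattern, not the module's own code. The
helper name run_and_capture is hypothetical, and the running Perl interpreter ($^X)
stands in for the external tool so that the example runs without the Charniak
parser or chunklink installed.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper (not part of Clair::Utils::Parse): run a command,
# capture its standard output, and optionally save the output to a file.
sub run_and_capture {
    my ($command, %args) = @_;
    my $output_file = defined $args{output_file} ? $args{output_file} : "";

    # Backticks return everything the command wrote to standard output.
    my $result = `$command`;
    die "Command failed: $command\n" if $? != 0;

    if ($output_file ne "") {
        open(my $out, '>', $output_file) or die "Cannot open $output_file: $!";
        print {$out} $result;
        close $out;
    }

    return $result;
}

# Stand-in for the external tool: the current Perl interpreter printing "42".
my (undef, $saved) = tempfile(UNLINK => 1);
my $text = run_and_capture(qq{"$^X" -e "print 42"}, output_file => $saved);
print "captured: $text\n";
```

Unlike the module's methods, this sketch dies when the command exits with a
non-zero status; whether that is desirable depends on how the caller wants to
handle a failed parser run.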

prepare_for_parse
Clair::Utils::Parse::prepare_for_parse("input.txt", "output.txt");
The prepare_for_parse method writes a file that is ready to be run through the
Charniak parser. It splits the input file into sentences and writes each sentence
on its own line, wrapped in <s> </s> tags.
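
The output format can be illustrated with a small self-contained sketch. The
module itself relies on Clair::Document to split sentences; the splitter below
(naive_split_into_sentences) is a hypothetical stand-in that breaks after '.',
'!' or '?' followed by whitespace, and will mis-split abbreviations. Only the
wrapping in <s> </s> tags mirrors the module's code.

```perl
use strict;
use warnings;

# Hypothetical stand-in for Clair::Document's sentence splitter:
# collapse whitespace, then break after '.', '!' or '?' followed by a space.
sub naive_split_into_sentences {
    my ($text) = @_;
    $text =~ s/\s+/ /g;     # newlines and runs of spaces become one space
    $text =~ s/^ | $//g;    # trim a leading or trailing space
    return split /(?<=[.!?]) /, $text;
}

# Wrap each sentence in <s> </s> tags, one sentence per line.
sub wrap_sentences {
    my ($text) = @_;
    return join "", map { "<s> $_ </s>\n" } naive_split_into_sentences($text);
}

my $wrapped = wrap_sentences(
    "The parser reads one sentence per line. Does this work?\nIt should."
);
print $wrapped;
# Prints:
# <s> The parser reads one sentence per line. </s>
# <s> Does this work? </s>
# <s> It should. </s>
```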

Methods code


chunklink
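sub chunklink {
	my $filename = shift;

	my %args = @_;

	# Optional named arguments; the tool path falls back to $CHUNKLINK_PATH.
	my $chunk_path = (defined $args{path} ? $args{path} : $CHUNKLINK_PATH);
	my $output_file = (defined $args{output_file} ? $args{output_file} : "");
	my $options = (defined $args{options} ? $args{options} : "");

	# Run chunklink and capture its standard output; stderr is discarded.
	my $result = `$chunk_path $options $filename 2> /dev/null`;

	# Save a copy of the output if an output file was requested.
	if ($output_file ne "") {
		open(my $out, '>', $output_file) or die "Cannot open $output_file: $!";
		print $out $result;
		close $out;
	}

	return $result;
}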

forcl
sub forcl {
	my $filename = shift;

	my %args = @_;

	my $output_file = (defined $args{output_file} ? $args{output_file} : "");

	my $result = "";

	# Copy the file line by line, rewriting the first "(S1 (" on each line to "( (".
	open(my $in, '<', $filename) or die "Cannot open $filename: $!";
	while (<$in>) {
		s/\(S1 \(/\( \(/;
		$result .= $_;
	}
	close $in;

	# Save a copy of the result if an output file was requested.
	if ($output_file ne "") {
		open(my $out, '>', $output_file) or die "Cannot open $output_file: $!";
		print $out $result;
		close $out;
	}

	return $result;
}
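
forcl has no description. Judging only from its code, it copies a file line by
line and rewrites the first "(S1 (" on each line to "( (", which removes the S1
root label from a bracketed parse. The name suggests "for chunklink", that is, a
step that adapts parser output before it is passed to chunklink, but this is an
inference and is not stated in the source. The sketch below applies the same
substitution to a single string; the helper name strip_s1_label and the sample
tree are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: the substitution used by forcl, applied to one string.
sub strip_s1_label {
    my ($line) = @_;
    $line =~ s/\(S1 \(/\( \(/;   # only the first match on the line is rewritten
    return $line;
}

# Made-up example of a bracketed parse with an S1 root label.
my $parser_line = "(S1 (S (NP (DT The) (NN dog)) (VP (VBZ barks)) (. .)))";
my $converted = strip_s1_label($parser_line);
print "$converted\n";
# Prints: ( (S (NP (DT The) (NN dog)) (VP (VBZ barks)) (. .)))
```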

parse
sub parse {
	my $filename = shift;

	my %args = @_;

	# Optional named arguments; paths fall back to the configured defaults.
	my $output_file = (defined $args{output_file} ? $args{output_file} : "");
	my $char_path = (defined $args{path} ? $args{path} : $CHARNIAK_PATH);
	my $char_data_path = (defined $args{data_path} ? $args{data_path} : $CHARNIAK_DATA_PATH);
	my $options = (defined $args{options} ? $args{options} : "");

	# Run the Charniak parser and capture its standard output.
	my $result = `$char_path $options $char_data_path $filename`;

	# Save a copy of the output if an output file was requested.
	if ($output_file ne "") {
		open(my $out, '>', $output_file) or die "Cannot open $output_file: $!";
		print $out $result;
		close $out;
	}

	return $result;
}

prepare_for_parse
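sub prepare_for_parse {
	my $filename = shift;
	my $outfile = shift;

	# Load the input file as a plain-text Clair document.
	my $doc = Clair::Document->new(file => $filename, id => 'parse_doc', type => 'text');

	my @sentences = $doc->split_into_sentences();

	open(my $out, '>', $outfile) or die "Cannot open $outfile: $!";

	# Write one sentence per line, wrapped in <s> </s> tags.
	foreach my $sent (@sentences) {
		print $out "<s> $sent </s>\n";
	}

	close $out;
}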

General documentation


VERSION
Version 0.01

AUTHOR
Hodges, Mark <clair at umich.edu>
Radev, Dragomir <radev at umich.edu>

BUGS
Please report any bugs or feature requests to
bug-clair-document at rt.cpan.org, or through the web interface at
http://rt.cpan.org/NoAuth/ReportBug.html?Queue=clairlib-dev.
I will be notified, and then you will automatically be notified of progress on
your bug as I make changes.

SUPPORT
You can find documentation for this module with the perldoc command.
    perldoc Clair::Utils::Parse
You can also look for information at:

    * AnnoCPAN: Annotated CPAN documentation
      http://annocpan.org/dist/clairlib-dev

    * CPAN Ratings
      http://cpanratings.perl.org/d/clairlib-dev

    * RT: CPAN's request tracker
      http://rt.cpan.org/NoAuth/Bugs.html?Dist=clairlib-dev

    * Search CPAN
      http://search.cpan.org/dist/clairlib-dev

COPYRIGHT & LICENSE
Copyright 2006 The University of Michigan, all rights reserved.
This program is free software; you can redistribute it and/or modify it
under the same terms as Perl itself.