Clair::Network

CFNetwork


SummaryIncluded librariesPackage variablesSynopsisGeneral documentationMethods

SummaryTop
Clair::Network::CFNetwork

Package variablesTop
No package variables defined.

Included modulesTop
Clair::Network
Clair::Network::Reader::Pajek
Clair::Network::Writer::PajekProject

InheritTop
Clair::Network

SynopsisTop
CFNetwork is a subclass of the Clair::Network. It includes the extra functionality needed to perform community finding within weighted, undirected networks. The main community finding algorithm is an implementation of the algorithm described in: A. Clauset, M. E. J. Newman and C. Moore, "Finding Community Structure in Very Large Networks," Phys. Rev. E. 70, 066111 (2004).

DescriptionTop
No description!
MethodsTop
_initializeDescriptionCode
_joinDescriptionCode
_printInternalsDescriptionCode
add_edgeDescriptionCode
add_weighted_edgeDescriptionCode
buildFromMatrixFileDescriptionCode
calc_kDescriptionCode
communityFindDescriptionCode
export2PajekProjectDescriptionCode
getConnectedComponentDescriptionCode
getMDescriptionCode
isConnectedDescriptionCode
newDescriptionCode
numConnectedComponentsDescriptionCode
printEdgeWeightsDescriptionCode
printNodeLabelsDescriptionCode
printNodeValuesDescriptionCode
readFromPajekDescriptionCode
remove_edgeDescriptionCode
remove_nodeDescriptionCode

Methods description


_initializecode    nextTop
    ($QR,$aR,$HR,$HM) = $cfn->initialize;
Internal subroutine; there is no need for a user to call this routine, as it is only needed within the community finding algorithm. Initializes modularity table for community finding algorithm. Also calculates supporting information.
$QR is a reference to a hash of refernences to hashes; basically, each hash stored in $QR contains the positive Delta Q_i_j values of the modularity table.
$aR is a hash reference; it stores the a_i values for each node.
$HR is a hash reference; $HR->{$i} is the node index for which delta Q is highest, given $i.
$HM is an array reference; the first element is the highest delta Q value in the table. The second element and third elements are the ($i,$j) location of that value.

_joincodeprevnextTop
    $self->_join($i,$j,$QR,$aR,$HR,$HM);
The procedure that joins two communities; here, $i is absorbed by $j. Also updates $QR, $aR, $HR, and $HM to reflect the change. This procedure should only be used internally, within communityFind().

_printInternalscodeprevnextTop
    $self->_printInternals($QR,$aR,$HR,$HM);
This is a debugging procedure that prints to standard output the contents of the given variables in a somewhat-readable fashion. It should not be used.

add_edgecodeprevnextTop
    $cfn->add_edge($u,$v)
Adds a new edge to the network, but gives a default weight of 1. Prevents creation of self-edges.

add_weighted_edgecodeprevnextTop
    $cfn->add_weighted_edge($u,$v,$weight)
Adds a new edge to the network, with the given weight. Prevents creation of self-edges. If the edge specified already exists in the network, its new weight is the sum of the old and $weight. However, if this new weight is negative, the edge is removed from the network completely. In this way, a network of non-negative, non-self edges can be maintained, and the weights can be easily incremented/decremented with this function. This function returns the new weight of the edge, or 0 if the new weight would be negative or if the edge specified was a self-edge. This routine also updates the m value associated with this network (that is, the sum of the weights of all edges in the network).

buildFromMatrixFilecodeprevnextTop
    $cfn->buildFromMatrixFile($filename,$ignoreweights)
This routine will read a symmetric adjacency matrix from a file and add those edges to the existing graph, creating nodes where necessary. If an edge between two nodes already exists, the weight in the matrix file will be added to the edge weight.
If $ignoreweights is set to true, than the actual added weight for every edge will be 1, regardless of the matrix values. This allows the construction of an idential graph with uniform (1) weights.
The matrix file should contain only the adjacency matrix, 1 row per line, with elements separated with whitespace. The matrix is assumed to be symmetric.

calc_kcodeprevnextTop
    $k = $cfn->calc_k($u)
For a given node, returns the sum of the weights of all the edges connected to it.

communityFindcodeprevnextTop
    $cfn->communityFind()
$cfn->communityFind(skip_connection_test => 0);
$cfn->communityFind(dirname => $dirname)
The community finding algorithm. Writes the series of joins to the "$cfn->{name}.joins"; writes the optimal node labeling to the "$cfn->{name}.bestComm". The format of the .joins file is  (x y c Q) for each line, where:
    x = (label of community that grows)
y = (label of community that is absorbed)
c = (number of communities remaining)
Q = (Q value after the join).
The format of .bestComm file is one node and its label per line.
This function also produces a file "$cfn->{name}.mljoins", which is a version of the .joins file that has been formatted so that it can be plotted using the Matlab dendrogram() function. Since this Matlab function fails if the node ids are non-sequential or contain an id of zero, new node ids are needed; therefore, the node ids in this file will be different from the ids in the .joins file and in the source data. The dendrogram can be plotted in Matlab using the supplied plotDendro.m file; an appropriate labelfile for the Matlab function is also generated by communityFind(), by calling printNodeLabels().
This routine calls _initialize, _join, and (for debugging) _printInternals.
Note that communityFind() expects that the network is connected, and, by default, will abort with an error message otherwise. To determine whether the network is connected, the function calls the Graph::is_connected() function prior to doing anything. However, for large graphs, this can be an expensive operation. This check can be turned off by setting the parameter variable skip_connection_test to 1 -- in this case, the check will not be performed, and the algorithm will continue forward. Only turn off the connection test if you know the network is fully connected and wish to avoid the overhead of checking again; otherwise errors will occur.
To find communities within the connected components of the network, use getConnectedComponent to generate sub-CFNetworks from those connected components, and then run communityFind() on those sub-CFNetworks.
This function returns the value of the highest Q value achieved during the community merging, or 0 if the input graph was non-connected.
If dirname is given, this will be location the files are written to. If it is not given, the files will be written to the current working directory.

export2PajekProjectcodeprevnextTop
    $self->export2PajekProject(partition => $partition)
$self->export2PajekProject(partition => $partition, dirname => $dirname)
Creates a Pajek-formatted file representing the network. $partition
can either be "best", in which case the Pajek file will include a
partition that labels vertices according to the .bestComm file, or a
number. If $partition is a number, the Pajek file will include a
partition that divides the network into that many communities.
$dirname is the location of the .bestComm and .join files, as well as
the location to which the output file will be written. If $dirname
isn't given, the location to read and write from is the location the
current working directory.

getConnectedComponentcodeprevnextTop
    my $subCFNetwork = $cfn->getConnectedComponent($num)
This routine builds a new CFNetwork, consisting of the nodes and edges of the $num-th largest connected component of the original CFNetwork. For example, if $num == 2, the output will be a reference to a new CFNetwork object, with nodes identical to those in the 2nd largest connected component of the original, and all their internal edges.
The labels and values of nodes are perserved, as are the edge weights. The name associated with the new CFNetwork is the old name, suffixed with ".$num".
If $num is greater than the number of connected components, the returned network will consist of the smallest connected component of the original. If $num is less than 1, the returned network will be the largest connected component of the original.
This function allows large, non-connected networks to be broken into pieces that can be analyzed by communityFind(). Since the components are not connected, nodes in different components cannot be part of the same community structure.

getMcodeprevnextTop
    $m = $cfn->getM
Returns the sum of all the edge weights in the graph.

isConnectedcodeprevnextTop
    if ( $cfn->isConnected ) { ... }
Returns true if the CFNetwork is connected, false otherwise. May take some time on larger networks.

newcodeprevnextTop
    my $cfn = Clair::Network::CFNetwork->new()
my $cfn = Clair::Network::CFNetwork->new(name => "Network_Name")
Constructor for CFNetwork class objects. Creates an empty network. If name is specified, it will be used as the network name; the default name is "CFNetwork". The network name will be used as a prefix for certain output files; ideally it should not include spaces.

numConnectedComponentscodeprevnextTop
    $self->numConnectedComponents
Returns the number of connected components in the network. Can be time consuming for larger networks.

printEdgeWeightscodeprevnextTop
    $cfn->printEdgeWeights($filename);
This prints to filename a list of all the edges and their weights in the graph, one per line. The format is "source target weight". The weights are sorted by the source node id.

printNodeLabelscodeprevnextTop
    $cfn->printNodeLabels($filename);
This prints to filename a list of all the node labels in the graph, one per line. The labels are sorted by node id. If a node has no label, its label is listed as "undef".
This routine is useful when feeding node labels to a dendrogram drawing function.

printNodeValuescodeprevnextTop
    $cfn->printNodeValues($filename);
This prints to filename a list of all the node values in the graph, one per line. The values are sorted by node id. If a node has no value, its value is listed as "undef".

readFromPajekcodeprevnextTop
read pajek's .net file in and build the network
$network->readFromPajek("/data0/projects/cltest/CFNetwork/SampsonL.net");

remove_edgecodeprevnextTop
    $cfn->remove_edge($u,$v)
This is essentially the same as the Network::remove_edge function, except that it updates the value of m associated with this network in addition to deleting the edge.  This is necessary in order to maintain a valid m value in the presence of edge deletions. A running m tally is faster and easier then trying to calculate m everywhere it is needed, especially for larger networks.

remove_nodecodeprevnextTop
    $cfn->remove_node($id)
This is essentially the same as the Network::remove_node function, except that it updates the value of m associated with this network in addition to deleting the node.  This is necessary in order to maintain a valid m value in the presence of node deletions. A running m tally is faster and easier then trying to calculate m everywhere it is needed, especially for larger networks.

Methods code


_initializedescriptionprevnextTop
sub _initialize {
	my $self = shift;
	my $com = shift;

	my %deltaQ;
	my %aHash;
	my %HHash;
	my @HMAX;
	my $m = $self->getM;

	my $constant = 1/(2*$m);
my ($i, $j, $q); my @nodes = keys %{$com}; foreach( @nodes ) { $aHash{$_} = $constant * $self->calc_k($_); } $HMAX[0] = -1e50; $HMAX[1] = -1; $HMAX[2] = -1; foreach $i ( sort {$a <=> $b} @nodes ) { my %h; foreach ( $self->{graph}->edges_at($i) ) { #print "INIT: Examining edge: ${$_}[0] - ${$_}[1]\n";
if( ${$_}[0] eq $i ) { $j = ${$_}[1]; } else { $j = ${$_}[0] } $q = 2 * ( $constant * $self->{graph}->get_edge_weight($i,$j) - $aHash{$i} * $aHash{$j} ); $h{$j} = $q; if( ! exists $HHash{$i} ) { $HHash{$i} = $j; } elsif( $q > $h{ $HHash{$i} } ) { $HHash{$i} = $j; } if( $q > $HMAX[0] ) { $HMAX[0] = $q; $HMAX[1] = $i; $HMAX[2] = $j; #print "INIT: HMax now ($i,$j)\n";
} } $deltaQ{$i} =\% h; } return (\%deltaQ,\% aHash,\% HHash,\@ HMAX);
}

_joindescriptionprevnextTop
sub _join {
	my $self = shift;
	my ($i,$j,$QR,$aR,$HR,$HM) = @_;

	my @others = ();

## Find remaining communities other than $i, $j
foreach ( sort keys %{$QR} ) { if( $_ ne $i && $_ ne $j ) { push @others, $_; } } my $k; my $hasI; my $hasJ; my %changed; foreach $k ( @others ) { $hasI = 0; $hasJ = 0; if( exists $QR->{$k}->{$i} ) { $hasI = 1; } if( exists $QR->{$k}->{$j} ) { $hasJ = 1; } ## Adjust Row J & Row K
if( $hasI && $hasJ ) { $QR->{$j}->{$k} += $QR->{$i}->{$k}; $QR->{$k}->{$j} += $QR->{$k}->{$i}; $changed{$j} = 1; $changed{$k} = 1; } elsif( $hasI ) { $QR->{$j}->{$k} = $QR->{$i}->{$k} - 2 * $aR->{$j} * $aR->{$k}; $QR->{$k}->{$j} = $QR->{$k}->{$i} - 2 * $aR->{$k} * $aR->{$j}; $changed{$j} = 1; $changed{$k} = 1; } elsif( $hasJ ) { $QR->{$j}->{$k} = $QR->{$j}->{$k} - 2 * $aR->{$i} * $aR->{$k}; $QR->{$k}->{$j} = $QR->{$k}->{$j} - 2 * $aR->{$k} * $aR->{$i}; $changed{$j} = 1; $changed{$k} = 1; } } ## Adjust aHash
$aR->{$j} += $aR->{$i}; delete $aR->{$i}; ## Remove Row i, Column i
delete $QR->{$i}; foreach ( keys %{$QR} ) { delete $QR->{$_}->{$i}; } delete $HR->{$i}; if ( scalar( keys %{$QR} ) == 1 ) { ## Only one community left
$HR->{$j} = "undef"; ${$HM}[0] = "undef"; ${$HM}[1] = "undef"; ${$HM}[2] = "undef"; return; } ## Find new $HR->{$j}'s
foreach $k ( sort {$a <=> $b} keys %changed ) { my $max = -1e50; foreach( keys %{$QR->{$k}} ) { if( $QR->{$k}->{$_} > $max ) { $max = $QR->{$k}->{$_}; $HR->{$k} = $_; } } } ## Find new $HMAX
${$HM}[0] = $QR->{$j}->{ $HR->{$j} }; ${$HM}[1] = $j; ${$HM}[2] = $HR->{$j}; my $q; foreach $k ( sort {$a <=> $b} keys %{$HR} ) { if( $k ne $j ) { $q = $QR->{$k}->{ $HR->{$k} }; #print "k = $k j = $j HR[k] = ";
#print $HR->{$k}, " q = $q Max = ", ${$HM}[0], "\n";
if( $q > ${$HM}[0] ) { ${$HM}[0] = $q; ${$HM}[1] = $k; ${$HM}[2] = $HR->{$k}; } } } # Now ${$HM}[0] holds the highest delta Q value remaining
}

_printInternalsdescriptionprevnextTop
sub _printInternals {
	my $self = shift;
	my $QR = shift;
	my $aR = shift;
	my $HR = shift;
	my $HM = shift;

	my ($i,$j);

	foreach $i ( sort {$a <=> $b} keys %{$aR} ) {
		print "Community $i:  a = ", $aR->{$i};
		print "  Maximum DQ at $HR->{$i}\n";
	}

	foreach $i (sort {$a <=> $b} keys %{$QR} ) {

		print " DQ($i) : ";
		print "Num. entries = ",scalar( keys %{$QR->{$i}} ), " : ";
		foreach $j ( sort keys %{ $QR->{$i} } ) {

			print " $j:";
			print $QR->{$i}->{$j}, "  ";
		}
		print "\n";
	}

	print "Maximum Delta Q = ", ${$HM}[0];
	print "  Located at = (", ${$HM}[1], ",", ${$HM}[2], ")\n";
}

add_edgedescriptionprevnextTop
sub add_edge {
    my $self = shift;
    my $u = shift;
    my $v = shift;

    ## Assume a weight of 1
return $self->add_weighted_edge($u, $v, 1);
}

add_weighted_edgedescriptionprevnextTop
sub add_weighted_edge {
    my $self = shift;
    my $u = shift;
    my $v = shift;
    my $weight = shift;
    my $graph = $self->{graph};

    if( $u ne $v ) {

        if( $graph->has_edge($u,$v) ) {

            my $orig = $graph->get_edge_weight($u,$v);
            $weight = $weight + $orig;
            if( $weight < 0 ) {
                $graph->delete_edge($u,$v);
                $weight = 0;
                $self->{mfact} = $self->{mfact} - $orig;
            }
            else {
                $graph->set_edge_weight($u,$v,$weight);
                $self->{mfact} = $self->{mfact} + $weight - $orig;
            }
        }
        elsif ($weight >= 0 ) {
            $graph->add_weighted_edge($u, $v, $weight);
            $self->{mfact} = $self->{mfact} + $weight;
        }
    }
    else { $weight = 0; }

    return $weight;
}

buildFromMatrixFiledescriptionprevnextTop
sub buildFromMatrixFile {
	my $self = shift;
	my $file = shift;
	my $ignoreweights = shift;


#  Will read an Adjacency matrix from a file and add those edges to
# the existing graph, creating nodes where necessary. If an edge
# between two nodes already exists, the weight in the matrix file
# will be added to the edge weight.
# Assumes the matrix is symmetric.
# If the matrix file includes weights and $ignoreweights is set to 1, then all those
# weights will be discarded; the edges will have a weight equal to their previous
# weight + 1.
open(MAT, "<", $file) or die "Clair::Network::CFNetwork:buildFromMatricFile: Cannot open file $file\n"; print "\n**** READING MATRIX FILE $file ****\n"; my $line; my @w; my $num; my ($i,$j); $line = <MAT>; $line =~ s/^\s+|\s+$//g; @w = split(/\s+/, $line); $num = scalar(@w); # foreach( @w ) { print "|$_| "; }
# print "\n";
if( $num > $self->num_nodes) { for($i = $self->num_nodes + 1; $i<= $num; $i++) { $self->add_node($i,label => "N$i"); print " Created node $i (N$i)\n"; } } $i = 1; for( $j = $i+1; $j<= $num; $j++ ){ if ($w[$j-1] != 0) { if( $ignoreweights ) { $w[$j-1] = 1; } $self->add_weighted_edge($i,$j,$w[$j-1]); } } while ( $line = <MAT> ) { $line =~ s/^\s+|\s+$//g; $i++; @w = split(/\s+/, $line); for( $j = $i + 1; $j <= $num; $j++) { if( $ignoreweights ) { $w[$j-1] = 1; } if ($w[$j-1] != 0) { $self->add_weighted_edge($i,$j,$w[$j-1]); } } } close MAT; if( $i != $num ) { print "CFNetwork:buildFromMatrixFile: Warning -- input matrix file is not square.\n"; } print "\n**** DONE READING MATRIX FILE $file ****\n\n";
}

calc_kdescriptionprevnextTop
sub calc_k {
    my $self = shift;
    my $node = shift;
    my $graph = $self->{graph};

    my @e = $graph->edges_at($node);
    my $k = 0;

    foreach ( @e ) {
        #print "EDGE:  ", ${$_}[0], " -- ", ${$_}[1], "\n";
$k += $graph->get_edge_weight( ${$_}[0], ${$_}[1] ); } return $k;
}

communityFinddescriptionprevnextTop
sub communityFind {
	my $self = shift;
	my %parameters = @_;

	if( exists $parameters{skip_connection_test} ) {

		if( $parameters{skip_connection_test} ) {
			print "Skipping internal connection test...assuming network is connected.\n";
		}
		else {
## Check to see if the graph is connected; if not, exit; the algorithm
## can't handle disconnected componets well.
print "Determining if the network is connected...\n"; if( ! $self->{graph}->is_connected ) { print "Clair::Network::CFNetwork::communityFind: The existing network is not connected -- communityFind cannot be run.\n"; return 0; } } } else { ## Check to see if the graph is connected; if not, exit; the algorithm
## can't handle disconnected componets well.
print "Determining if the network is connected...\n"; if( ! $self->{graph}->is_connected ) { print "Clair::Network::CFNetwork::communityFind: The existing network is not connected -- communityFind cannot be run.\n"; return 0; } } my $dirname = ""; if (exists $parameters{dirname}) { if( -d $parameters{dirname}) { $dirname = $parameters{dirname}."/"; } else { print STDERR "\n**Directory does not exist: $parameters{dirname}\n"; print STDERR "**Printing to local directory.\n\n"; $dirname = "./"; } } else { $dirname = "./"; } my $best_outfile = $dirname . $self->{name} . ".bestComm"; my $joins_outfile = $dirname . $self->{name} . ".joins"; my $matlab_outfile = $dirname . $self->{name} . ".mljoins"; my $nlab_file = $dirname . $self->{name} . ".nlab"; # Print the node labels to a file -- useful in conjuction with the Matlab dendrograms
$self->printNodeLabels($nlab_file); my ($i,$j); my $Q = 0; open(JOINS, ">", $joins_outfile); open(MLAB, ">", $matlab_outfile); ## Set up initial community labels
my %communities; my $com =\% communities; my $oldLab; my @nodes = $self->{graph}->vertices; my $commCount = 1; my %matlabNodeIds; # Need new, sequential, non-zero node ids, or else matlab has a fit
# when trying to plot the dendrogram
foreach ( sort {$a <=> $b} @nodes ) { $com->{$_} = "$_"; $matlabNodeIds{$_} = $commCount; $commCount++; } --$commCount; #print "Number of Communities = $commCount\n";
## Initialize modularity
my ($QR,$aR,$HR,$HM) = $self->_initialize($com); ## Initialize the Q value for the current state, where every vertex is its own community
foreach( keys %{$aR} ) { $Q += -1 * $aR->{$_} * $aR->{$_}; } print "Initial Q value = $Q\n"; #$self->_printInternals($QR,$aR,$HR,$HM);
print "\n** Joining Communities **\n"; my $matlab_count = scalar( @nodes ) + 1; my $highestQ = $Q; my $matlab_distance = 0; my $flag = 1; while( $commCount > 1 ) { if( ${$HM}[0] < 0 && $flag == 1) { #We've achieved the best possible division, so write it to file
open(BEST, ">", $best_outfile); foreach( sort {$a <=> $b} keys %{$com} ) { print BEST "$_ $com->{$_}\n"; } $flag = 0; close BEST; } $i = ${$HM}[1]; $j = ${$HM}[2]; $Q += ${$HM}[0]; if( $Q > $highestQ ) { $highestQ = $Q; } # This is just for the matlab-formatted file so that a dendrogram can be drawn
$matlab_distance++; if( scalar( keys %{$QR->{$i}} ) < scalar(keys %{$QR->{$j}}) ) { print " Joining $i into $j... "; $self->_join($i,$j,$QR,$aR,$HR,$HM); $commCount = $commCount - 1; print JOINS "$com->{$j}\t$com->{$i}\t$commCount\t$Q\n"; print MLAB "$matlabNodeIds{$j}\t$matlabNodeIds{$i}\t$matlab_distance\n"; $matlabNodeIds{$j} = $matlab_count; $matlabNodeIds{$i} = $matlab_count; $matlab_count++; $oldLab = $com->{$i}; foreach( keys %{$com} ) { if( $com->{$_} eq $oldLab ) { $com->{$_} = $com->{$j}; } } #print "Joined $i into $j\n";
} else { print " Joining $j into $i... "; $self->_join($j,$i,$QR,$aR,$HR,$HM); $commCount = $commCount - 1; print JOINS "$com->{$i}\t$com->{$j}\t$commCount\t$Q\n"; print MLAB "$matlabNodeIds{$j}\t$matlabNodeIds{$i}\t$matlab_distance\n"; $matlabNodeIds{$j} = $matlab_count; $matlabNodeIds{$i} = $matlab_count; $matlab_count++; $oldLab = $com->{$j}; foreach( keys %{$com} ) { if( $com->{$_} eq $oldLab ) { $com->{$_} = $com->{$i}; } } #print "Joined $j into $i\n";
} print "Q is now $Q\n"; #foreach( sort keys %communities ) {
# print " N$_:$communities{$_} ";
#}
#print "\n";
#if( $commCount <= 31 ) {
# print "\n$commCount comms left\n";
# $self->_printInternals($QR,$aR,$HR,$HM);
#}
#print "\n";
} close JOINS; close MLAB; return $highestQ;
}

export2PajekProjectdescriptionprevnextTop
sub export2PajekProject {
	my $self = shift;
	my %parameters = @_;

	my $partition = $parameters{partition};

	my $export = Clair::Network::Writer::PajekProject->new();
	$export->set_name($self->{name});
	$export->write_network($self, $self->{name}, %parameters);
}

getConnectedComponentdescriptionprevnextTop
sub getConnectedComponent {
	my $self = shift;
	my $ccid = shift;
	my $output;

	my @cc = $self->{graph}->connected_components;
	my $numcc = scalar( @cc );
	print "\n$numcc connected components found.\n";

	if( $numcc < 2 ) {
		$output = $self;
	}
	else {

		if( $ccid > $numcc ) { $ccid = $numcc; }
		elsif( $ccid < 1 ) { $ccid = 1; }

		my $i;
		my %sizes;
		for($i = 0; $i<=$#cc; $i++) {
			$sizes{$i} =  scalar( @{$cc[$i]} );
		}
		my $compIndex;
		$i = 0;
		foreach( sort {$sizes{$b} <=> $sizes{$a}} keys %sizes ) {
			$i++;
			if( $i == $ccid ) {
				$compIndex = $_;
				last;
			}
		}

## Now we know that $cc[$compIndex] is the $ccidth-largest component; make a
## new CFNetwork using only those nodes.
my $name = $self->{name} . "." . $ccid; $output = Clair::Network::CFNetwork->new(name => $name); my @nodes = @{ $cc[$compIndex] }; my $label; my $value; foreach ( @nodes ) { $label = $self->get_vertex_attribute($_,"label"); if( $self->has_vertex_attribute($_,"value") ) { $value = $self->get_vertex_attribute($_,"value"); } else { $value = "undef"; } $output->add_node($_,label => $label, value => $value); #print "Added node $_ ($label)\n";
} my @edgesat; foreach ( @nodes ) { @edgesat = $self->{graph}->edges_at($_); foreach( @edgesat ) { if( ! $output->has_edge( ${$_}[0], ${$_}[1] ) ) { $value = $self->get_edge_weight( ${$_}[0], ${$_}[1] ); $output->add_weighted_edge( ${$_}[0], ${$_}[1], $value ); #print "Added edge ${$_}[0] -- ${$_}[1] (wt = $value)\n";
} } } } return $output;
}

getMdescriptionprevnextTop
sub getM {
    my $self = shift;
    return $self->{mfact};
}

isConnecteddescriptionprevnextTop
sub isConnected {
    my $self = shift;
    return $self->{graph}->is_connected;
}

newdescriptionprevnextTop
sub new {
  my $class = shift;
  my %parameters = @_;
  my $name;
  if( exists $parameters{name} ) {
      $name = $parameters{name}
  }
  else {
      $name = "CFNetwork";
  }

  my $graph = Graph::Undirected->new;

  my $self = bless {
      graph => $graph,  ## All the important network data
name => $name, ## The network name
mfact => 0, ## a running tally of the sum of all edge weights
directed => 0, ## This object is always an undirected graph
}, $class; return $self;
}

numConnectedComponentsdescriptionprevnextTop
sub numConnectedComponents {
	my $self = shift;
	my @cc = $self->{graph}->connected_components;
	return scalar( @cc );
}

printEdgeWeightsdescriptionprevnextTop
sub printEdgeWeights {
	my $self = shift;
	my $file = shift;

	open(OUT, ">", $file) or die "CFNetwork:printEdgeWeights:  Cannot write to specified file $file\n";

	foreach ( sort {${$a}[0] <=> ${$b}[0] } $self->{graph}->edges ) {

		print OUT "${$_}[0] ${$_}[1] ";
		print OUT $self->get_edge_weight(${$_}[0],${$_}[1]), "\n";

	}

	close OUT;
}

printNodeLabelsdescriptionprevnextTop
sub printNodeLabels {
	my $self = shift;
	my $file = shift;

	open(OUT, ">", $file) or die "CFNetwork:printNodeLabels:  Cannot write to specified file $file\n";

	foreach ( sort {$a <=> $b} $self->{graph}->vertices ) {

		if( $self->has_vertex_attribute($_,"label") ) {
			print OUT "$_ ",$self->get_vertex_attribute($_,"label"), "\n";
		}
		else {
			print OUT "undef\n";
		}
	}

	close OUT;
}

printNodeValuesdescriptionprevnextTop
sub printNodeValues {
	my $self = shift;
	my $file = shift;

	open(OUT, ">", $file) or die "CFNetwork:printNodeValues:  Cannot write to specified file $file\n";

	foreach ( sort {$a <=> $b} $self->{graph}->vertices ) {

		if( $self->has_vertex_attribute($_,"value") ) {
			print OUT "$_ ", $self->get_vertex_attribute($_,"value"), "\n";
		}
		else {
			print OUT "undef\n";
		}
	}

	close OUT;
}

readFromPajekdescriptionprevnextTop
sub readFromPajek {
	my $self = shift;
	my $fileName = shift;
	my $reader = Clair::Network::Reader::Pajek->new();

	my $net = $reader->read_network($fileName);
#copy network
my $cnt = 1; my %node_hash = (); #add nodes
foreach my $v ($net->get_vertices) { $self->add_node($cnt, label => $v); $node_hash{$v} = $cnt++; } #add edges
foreach my $e ($net->get_edges) { my ($u, $v) = @$e; $self->add_edge($node_hash{$u}, $node_hash{$v}); }
}

remove_edgedescriptionprevnextTop
sub remove_edge {
    my $self = shift;
    my $u = shift;
    my $v = shift;

    if( $self->has_edge($u,$v) ) {
        ##update m
$self->{mfact} = $self->{mfact} - $self->get_edge_weight($u,$v); $self->{graph}->delete_edge($u,$v); }
}

remove_nodedescriptionprevnextTop
sub remove_node {
    my $self = shift;
    my $id = shift;

    if( $self->has_node($id) ) {
        ## update m Factor
$self->{mfact} = $self->{mfact} - $self->calc_k($id); $self->{graph}->delete_vertex($id); }
}

General documentation


DescriptionTop
CFNetwork is a subclass of the Clair::Network. It includes the extra functionality needed to perform community finding within weighted, undirected networks. The main community finding algorithm is an implementation of the algorithm described in: A. Clauset, M. E. J. Newman and C. Moore, "Finding Community Structure in Very Large Networks," Phys. Rev. E. 70, 066111 (2004).

AUTHORTop
Ryan Roth << rmr48 AT columbia dot edu >>

BUGSTop
None Known.

COPYRIGHT & LICENSETop
Copyright 2007 Ryan Roth, all rights reserved.
This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.