Implementing X-means clustering in Java using the WEKA API.


This post shows how to run the X-means clustering algorithm in Java using Weka.

Prepare your data as an ARFF file (the example reads PAPERS.arff from the working directory) and use the following code to run the X-means clustering algorithm. The output lists each instance alongside the cluster it was assigned to.
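The structure of PAPERS.arff is not shown in this post, so here is a minimal sketch of what preparing the data could look like, assuming purely numeric attributes. The class name `ArffSketch`, the relation name `points`, and the attributes `x` and `y` are made up for illustration. It uses only the standard library and builds the ARFF text by hand: a `@relation` line, one `@attribute` line per column, then `@data` followed by comma-separated rows.

```java
public class ArffSketch
{
    // Builds ARFF text for purely numeric data: a header naming the relation
    // and each attribute, followed by one comma-separated row per instance.
    // Assumes attribute names contain no spaces (those would need quoting).
    public static String toArff(String relation, String[] attributes, double[][] rows)
    {
        StringBuilder sb = new StringBuilder();
        sb.append("@relation ").append(relation).append("\n\n");
        for (String attribute : attributes)
        {
            sb.append("@attribute ").append(attribute).append(" numeric\n");
        }
        sb.append("\n@data\n");
        for (double[] row : rows)
        {
            if (row.length != attributes.length)
            {
                throw new IllegalArgumentException("Row length does not match attribute count");
            }
            for (int i = 0; i < row.length; i++)
            {
                if (i > 0)
                {
                    sb.append(',');
                }
                sb.append(row[i]);
            }
            sb.append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args)
    {
        // Hypothetical two-dimensional points, used only to show the format.
        String[] attributes = {"x", "y"};
        double[][] rows = {{1.0, 2.0}, {1.5, 1.8}, {8.0, 8.5}};
        System.out.print(toArff("points", attributes, rows));
    }
}
```

Saving the printed text to a file with the .arff extension gives something the `Instances` constructor used below should be able to read.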


import java.io.BufferedReader;
import java.io.FileNotFoundException;
import java.io.FileReader;
import weka.clusterers.XMeans;
import weka.core.Instances;

public class Cluster
{
    // Opens the data file for reading. A missing file is reported and the
    // exception rethrown, so the caller never receives a null reader.
    public static BufferedReader readDataFile(String filename) throws FileNotFoundException
    {
        try
        {
            return new BufferedReader(new FileReader(filename));
        }
        catch (FileNotFoundException ex)
        {
            System.err.println("File not found: " + filename);
            throw ex;
        }
    }

    public static void main(String[] args) throws Exception
    {
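        // Configure X-means: a fixed seed for reproducible runs and
        // an upper bound of 2 on the number of clusters.
        XMeans xmean = new XMeans();
        xmean.setSeed(10);
        xmean.setMaxNumClusters(2);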

        // Load the ARFF data; try-with-resources closes the reader afterwards.
        Instances data;
        try (BufferedReader datafile = readDataFile("PAPERS.arff"))
        {
            data = new Instances(datafile);
        }
 
        xmean.buildClusterer(data);

        // Print the cluster assigned to each instance.
        for (int i = 0; i < data.numInstances(); i++)
        {
            System.out.printf("Instance %d -> Cluster %d%n", i, xmean.clusterInstance(data.instance(i)));
        }
    }
}



Output:

Instance 0 -> Cluster 0
Instance 1 -> Cluster 0
Instance 2 -> Cluster 1
Instance 3 -> Cluster 1
Instance 4 -> Cluster 0
Instance 5 -> Cluster 0
Instance 6 -> Cluster 0
Instance 7 -> Cluster 0
Instance 8 -> Cluster 0  

. . .
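A per-instance listing like the one above gets long quickly, so it can help to tally how many instances landed in each cluster. The following is only a sketch: the class `ClusterSizesSketch` and the method `countPerCluster` are hypothetical names, and the array holds just the nine assignments printed above rather than the full result. It needs only the standard library; in the program above, the same array could be filled from the values returned by `clusterInstance`.

```java
import java.util.Map;
import java.util.TreeMap;

public class ClusterSizesSketch
{
    // Counts how many instances were assigned to each cluster index.
    // A TreeMap keeps the clusters sorted by index for printing.
    public static Map<Integer, Integer> countPerCluster(int[] assignments)
    {
        Map<Integer, Integer> sizes = new TreeMap<>();
        for (int cluster : assignments)
        {
            sizes.merge(cluster, 1, Integer::sum);
        }
        return sizes;
    }

    public static void main(String[] args)
    {
        // The first nine assignments from the output above.
        int[] assignments = {0, 0, 1, 1, 0, 0, 0, 0, 0};
        countPerCluster(assignments).forEach((cluster, size) ->
            System.out.printf("Cluster %d: %d instances%n", cluster, size));
    }
}
```

For these nine instances the tally is 7 in cluster 0 and 2 in cluster 1.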
