public class Dataset
extends java.lang.Object
... ProjectionMethod are instances of this class. Dataset provides methods for
working with such sets (e.g. opening a dataset, adding points, checking their
integrity, finding the nearest neighbors of a point, calculating interpoint
distances, etc.). It is assumed that all points in a dataset have the same
dimensionality.
Modifier and Type | Field and Description |
---|---|
(package private) org.apache.log4j.Logger | logger: Logger. |
Constructor and Description |
---|
Dataset(int dimensions): Creates an instance of Dataset. |
Dataset(int dimensions, int numPoints): Creates an instance of Dataset. |
Modifier and Type | Method and Description |
---|---|
void | addPoint(DataPoint point): Add datapoint without checking whether it is unique or not. |
DataPoint | addPoint(DataPoint point, double tolerance): Add a new datapoint to the dataset. |
void | clear(): Clear all data. |
void | exportToCSV(java.io.File theFile): Save the current dataset to a stored file. |
int | getClosestIndex(DataPoint point): Returns the index of the closest point. |
double | getComponent(int datapointNumber, int dimension): Get a specific coordinate of a specific datapoint. |
double | getCovariance(int i, int j): Returns the covariance of the ith component of the dataset with respect to the jth component. |
Jama.Matrix | getCovarianceMatrix(): Returns a covariance matrix for the dataset. |
java.util.ArrayList<DataPoint> | getDatasetCopy() |
int | getDimensions() |
double | getDistance(DataPoint point1, DataPoint point2): Returns the Euclidean distance between two points. |
double | getDistance(int index1, int index2): Get the distance between two points. |
double[][] | getDistances(): Returns a matrix of interpoint distances between the points in the dataset. |
java.lang.String[][] | getDoubleStrings(): Returns a matrix of strings, one row for each datapoint, representing the dataset. |
int[] | getKNearestNeighbors(int k, DataPoint point): Returns k neighbors, where the 0th item is the closest, the 1st item is the second closest, etc. |
int | getKthVariantDimension(int k): Returns the k'th most variant dimension. |
DataPoint | getLastAddedPoint(): Returns the last point added to this dataset. |
double | getMaximumDistance(): Get the maximum interpoint distance between points in the dataset. |
double | getMean(int d): Returns the mean of the dataset on a given dimension. |
double | getMinimumDistance(): Get the minimum interpoint distance between points in the dataset. |
int | getNumPoints() |
DataPoint | getPoint(int i): Get a specified point in the dataset. |
double | getSumDistances() |
void | mirror(Dataset other): Makes this dataset a copy of the passed in dataset. |
void | perturbOverlappingPoints(double factor): Find repeated points and perturb them slightly so they don't overlap. |
void | postOpenInit(): Initializes Dataset from persistent data. |
void | preSaveInit(): Initializes persistent data. |
void | printDataset(): Print out all points in the dataset. Useful for debugging. |
void | randomize(int upperBound): Randomize dataset to a value between 0 and upperBound. |
void | resultsToMaple(java.io.PrintStream ps): Print out low dimensional points so Maple can plot them. Only handles a low dimension of 2. |
void | setPoint(int i, DataPoint point): Set a specified point in the dataset. |
java.lang.String | toString() |
public Dataset(int dimensions)
dimensions - dimension of dataset
public Dataset(int dimensions, int numPoints)
dimensions - dimension of dataset
numPoints - number of points
public DataPoint getPoint(int i)
i - index of the point to get
public DataPoint addPoint(DataPoint point, double tolerance)
point - a point in the high dimensional space
tolerance - forwarded to isUniquePoint; if -1 then add point regardless of whether it is unique or not
public void addPoint(DataPoint point)
point - point to be added
public void setPoint(int i, DataPoint point)
i - index of the point to set
point - the new n-dimensional point
public int getNumPoints()
public void clear()
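The tolerance parameter of addPoint(DataPoint point, double tolerance) is only described as being forwarded to isUniquePoint, with -1 meaning "add regardless". The following is a minimal sketch of one way such a check could behave, not the class's actual code. It assumes plain double[] arrays in place of DataPoint, assumes that a point counts as a duplicate when it lies within the tolerance (Euclidean distance) of a stored point, and returns a boolean rather than a DataPoint. TolerantPointList and its methods are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for Dataset; points are double[] instead of DataPoint. */
final class TolerantPointList {
    private final List<double[]> points = new ArrayList<>();

    /** Euclidean distance between two points of equal dimensionality. */
    static double distance(double[] a, double[] b) {
        double sum = 0.0;
        for (int d = 0; d < a.length; d++) {
            double diff = a[d] - b[d];
            sum += diff * diff;
        }
        return Math.sqrt(sum);
    }

    /** Assumed rule: unique if no stored point lies within tolerance of the candidate. */
    boolean isUnique(double[] candidate, double tolerance) {
        for (double[] p : points) {
            if (distance(p, candidate) <= tolerance) {
                return false;
            }
        }
        return true;
    }

    /** Adds the point if tolerance is -1 or the point is unique; reports whether it was added. */
    boolean add(double[] candidate, double tolerance) {
        if (tolerance == -1 || isUnique(candidate, tolerance)) {
            points.add(candidate.clone());
            return true;
        }
        return false;
    }

    int size() {
        return points.size();
    }

    public static void main(String[] args) {
        TolerantPointList list = new TolerantPointList();
        System.out.println(list.add(new double[] {0.0, 0.0}, 0.1));
        System.out.println(list.add(new double[] {0.05, 0.0}, 0.1));
        System.out.println(list.add(new double[] {0.05, 0.0}, -1));
    }
}
```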
public void randomize(int upperBound)
upperBound - highest value to be used
public double getMinimumDistance()
public double getMaximumDistance()
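getMinimumDistance and getMaximumDistance report the smallest and largest interpoint distance. As an illustration only, the sketch below computes both over every unordered pair of points, assuming Euclidean distance (the metric named for getDistance) and double[] arrays in place of DataPoint. DistanceExtremes and minMax are hypothetical names, and how the real class treats datasets with fewer than two points is not stated here.

```java
/** Hypothetical helper; rows of the input array stand in for DataPoint objects. */
final class DistanceExtremes {

    /** Euclidean distance between two points of equal dimensionality. */
    static double distance(double[] a, double[] b) {
        double sum = 0.0;
        for (int d = 0; d < a.length; d++) {
            double diff = a[d] - b[d];
            sum += diff * diff;
        }
        return Math.sqrt(sum);
    }

    /** Returns {min, max} over all pairs i < j; expects at least two points. */
    static double[] minMax(double[][] points) {
        double min = Double.POSITIVE_INFINITY;
        double max = 0.0;
        for (int i = 0; i < points.length; i++) {
            for (int j = i + 1; j < points.length; j++) {
                double d = distance(points[i], points[j]);
                min = Math.min(min, d);
                max = Math.max(max, d);
            }
        }
        return new double[] {min, max};
    }

    public static void main(String[] args) {
        double[][] points = {{0, 0}, {3, 4}, {6, 8}};
        double[] result = minMax(points);
        System.out.println("min = " + result[0] + ", max = " + result[1]);
    }
}
```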
public void exportToCSV(java.io.File theFile)
theFile - the file where data should be saved
public void perturbOverlappingPoints(double factor)
factor - distance to perturb
public void resultsToMaple(java.io.PrintStream ps)
ps - the stream to print to
public double getComponent(int datapointNumber, int dimension)
datapointNumber - index of the point to get
dimension - dimension of the desired component
public int getClosestIndex(DataPoint point)
point - the point to check
public int[] getKNearestNeighbors(int k, DataPoint point)
k - the number of points to retrieve
point - the point to find neighbors for
public double getDistance(int index1, int index2)
index1 - index of point 1
index2 - index of point 2
public double getDistance(DataPoint point1, DataPoint point2)
point1 - first point of distance
point2 - second point of distance
public int getDimensions()
public double[][] getDistances()
public double getSumDistances()
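The nearest-neighbor lookups documented above, getClosestIndex and getKNearestNeighbors, both return indices ordered by distance to a query point. One simple way to get that ordering is to sort the point indices by their Euclidean distance to the query, as in the sketch below. This is an assumption about the approach, not the class's implementation; NearestNeighborSketch, kNearest and closestIndex are hypothetical names, and double[] arrays stand in for DataPoint.

```java
import java.util.Arrays;
import java.util.Comparator;

/** Hypothetical helper; rows of the input array stand in for DataPoint objects. */
final class NearestNeighborSketch {

    /** Euclidean distance between two points of equal dimensionality. */
    static double distance(double[] a, double[] b) {
        double sum = 0.0;
        for (int d = 0; d < a.length; d++) {
            double diff = a[d] - b[d];
            sum += diff * diff;
        }
        return Math.sqrt(sum);
    }

    /** Indices of the k points nearest to query, closest first (fewer if the set is smaller). */
    static int[] kNearest(double[][] points, double[] query, int k) {
        Integer[] order = new Integer[points.length];
        for (int i = 0; i < order.length; i++) {
            order[i] = i;
        }
        // Stable sort of indices by distance to the query point.
        Arrays.sort(order, Comparator.comparingDouble((Integer i) -> distance(points[i], query)));
        int n = Math.min(k, order.length);
        int[] result = new int[n];
        for (int i = 0; i < n; i++) {
            result[i] = order[i];
        }
        return result;
    }

    /** Index of the single closest point; expects a non-empty set. */
    static int closestIndex(double[][] points, double[] query) {
        return kNearest(points, query, 1)[0];
    }

    public static void main(String[] args) {
        double[][] points = {{0, 0}, {10, 10}, {1, 1}, {5, 5}};
        double[] query = {0.9, 0.9};
        System.out.println(Arrays.toString(kNearest(points, query, 2)));
        System.out.println(closestIndex(points, query));
    }
}
```

Sorting every index costs O(n log n) per query, which is fine for small datasets; a partial selection or a spatial index would be the usual alternatives for larger ones.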
public double getMean(int d)
d - index of the dimension whose mean to get
public double getCovariance(int i, int j)
i - first dimension
j - second dimension
public Jama.Matrix getCovarianceMatrix()
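To make getMean, getCovariance and getCovarianceMatrix concrete, here is a small sketch of the underlying arithmetic. It assumes the population form of covariance (sum of products of deviations divided by the number of points); whether the class divides by N or N - 1 is not stated in this page. CovarianceSketch is a hypothetical name, double[] arrays stand in for DataPoint, and the result is a plain double[][] rather than a Jama.Matrix.

```java
/** Hypothetical helper; rows are points, columns are dimensions. */
final class CovarianceSketch {

    /** Mean of the dataset along dimension d. */
    static double mean(double[][] points, int d) {
        double sum = 0.0;
        for (double[] p : points) {
            sum += p[d];
        }
        return sum / points.length;
    }

    /** Population covariance between dimensions i and j (assumed normalization: divide by N). */
    static double covariance(double[][] points, int i, int j) {
        double meanI = mean(points, i);
        double meanJ = mean(points, j);
        double sum = 0.0;
        for (double[] p : points) {
            sum += (p[i] - meanI) * (p[j] - meanJ);
        }
        return sum / points.length;
    }

    /** Symmetric matrix whose (i, j) entry is covariance(points, i, j). */
    static double[][] covarianceMatrix(double[][] points) {
        int dims = points[0].length;
        double[][] m = new double[dims][dims];
        for (int i = 0; i < dims; i++) {
            for (int j = i; j < dims; j++) {
                m[i][j] = covariance(points, i, j);
                m[j][i] = m[i][j];
            }
        }
        return m;
    }

    public static void main(String[] args) {
        double[][] points = {{1, 2}, {2, 4}, {3, 6}};
        double[][] m = covarianceMatrix(points);
        System.out.println(m[0][0] + " " + m[0][1]);
        System.out.println(m[1][0] + " " + m[1][1]);
    }
}
```

A double[][] like this can be wrapped with JAMA's Matrix(double[][]) constructor when a Jama.Matrix is needed.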
public int getKthVariantDimension(int k)
k - rank of the variant dimension to return
public java.util.ArrayList<DataPoint> getDatasetCopy()
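getKthVariantDimension returns the k'th most variant dimension. A plausible reading is that dimensions are ranked by their variance across the dataset, largest first. The sketch below follows that reading and should be taken as an assumption: it uses population variance, treats k as 1-based (k = 1 is the most variant dimension), and uses the hypothetical names VariantDimensionSketch and kthVariantDimension. The real method's indexing convention is not stated in this page.

```java
import java.util.Arrays;
import java.util.Comparator;

/** Hypothetical helper; rows are points, columns are dimensions. */
final class VariantDimensionSketch {

    /** Population variance of the dataset along dimension d. */
    static double variance(double[][] points, int d) {
        double mean = 0.0;
        for (double[] p : points) {
            mean += p[d];
        }
        mean /= points.length;
        double sum = 0.0;
        for (double[] p : points) {
            sum += (p[d] - mean) * (p[d] - mean);
        }
        return sum / points.length;
    }

    /** Index of the dimension with the k'th largest variance; k is 1-based in this sketch. */
    static int kthVariantDimension(double[][] points, int k) {
        int dims = points[0].length;
        Integer[] order = new Integer[dims];
        for (int d = 0; d < dims; d++) {
            order[d] = d;
        }
        // Sort dimension indices by variance, largest first.
        Arrays.sort(order,
                Comparator.comparingDouble((Integer d) -> variance(points, d)).reversed());
        return order[k - 1];
    }

    public static void main(String[] args) {
        double[][] points = {{0, 0, 0}, {1, 10, 5}, {2, 20, 10}};
        System.out.println(kthVariantDimension(points, 1));
        System.out.println(kthVariantDimension(points, 2));
    }
}
```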
public void mirror(Dataset other)
other - the dataset to copy
public void printDataset()
public java.lang.String[][] getDoubleStrings()
public void preSaveInit()
public void postOpenInit()
public java.lang.String toString()
Overrides: toString in class java.lang.Object
public DataPoint getLastAddedPoint()