In machine learning, feature vectors are used to represent numeric or symbolic characteristics, called features, of an object in a mathematical, easily analyzable way. They are important for many different areas of machine learning and pattern processing. Machine learning algorithms typically require a numerical representation of objects in order for the algorithms to do processing and statistical analysis. Feature vectors are the equivalent of vectors of explanatory variables that are used in statistical procedures such as linear regression.
An example of a feature vector you might be familiar with is RGB (red-green-blue) color descriptions. A color can be described by how much red, blue, and green there is in it. A feature vector for this would be color = [R, G, B].
A vector is a series of numbers, like a matrix with one column but multiple rows, that can often be represented spatially. A feature is a numerical or symbolic property of an aspect of an object. A feature vector is a vector containing multiple elements about an object. Putting feature vectors for objects together can make up a feature space.
The features may represent, as a whole, one mere pixel or an entire image. The granularity depends on what someone is trying to learn or represent about the object. You could describe a 3-dimensional shape with a feature vector indicating its height, width, depth, etc.
Feature vectors are used widely in machine learning because of the effectiveness and practicality of representing objects in a numerical way to help with many kinds of analyses. They are good for analysis because there are many techniques for comparing feature vectors. One simple way to compare the feature vectors of two objects is to take the Euclidean distance.
In image processing, features can be gradient magnitude, color, grayscale intensity, edges, areas, and more. Feature vectors are particularly popular for analyses in image processing because of the convenient way attributes about an image, like the examples listed, can be compared numerically once put into feature vectors.
In speech recognition, features can be sound lengths, noise level, noise ratios, and more.
In spam-fighting initiatives, features are abundant. They can be IP location, text structure, frequency of certain words, or certain email headers.
Types of feature vectors
Feature vectors here are sentences in a larger vector space:
In pattern recognition processes, feature vectors are the tools used between gathering data, and making sense of the data:
- , S. File:RGB color solid cube.png. Retrieved July 20, 2016, from https://commons.wikimedia.org/wiki/File:RGB_color_solid_cube.png
- Gutierrez-Osuna, R. Introduction to Pattern Analysis. Retrieved April 30, 2016, from http://research.cs.tamu.edu/prism/lectures/iss/iss_l9.pdf
- Perone, C. Machine Learning: text feature extraction. Retrieved from http://blog.christianperone.com/2011/09/machine-learning-text-feature-extraction-tf-idf-part-i/