What is the Gist of a “Metric Space” or a “Metric” (Mathematics)?

“A metric space is an ordered pair (M,d) where M is a set and d is a metric on M.”

“The metric is a function that defines a concept of distance between any two members of the set, which are usually called points. The metric satisfies a few simple properties. Informally:

  • “the distance from a point to itself is zero,
  • “the distance between two distinct points is positive,
  • “the distance from A to B is the same as the distance from B to A, and
  • “the distance from A to B (directly) is less than or equal to the distance from A to B via any third point C.”

“The function d is also called distance function or simply distance. Often, d is omitted [when describing the metric space symbolically,] and one just writes M… if it is clear from the context what metric is used.”

(Quotes selected from “Metric space,” Wikipedia, retrieved 6/4/2020)
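
To make the four informal properties quoted above concrete, here is a minimal Python sketch that tests them on a small, finite set of points. The names `is_metric`, `euclidean`, and `squared_distance` are made up for this illustration and do not come from the Wikipedia article. Note that a check like this can only show that a function fails to be a metric on the points tried; passing on a handful of points does not prove the properties hold everywhere.

```python
from itertools import product


def is_metric(points, d, tol=1e-12):
    """Check the four metric properties on a finite collection of points."""
    for a, b in product(points, repeat=2):
        if a == b and abs(d(a, b)) > tol:
            return False  # the distance from a point to itself must be zero
        if a != b and d(a, b) <= 0:
            return False  # distinct points must be a positive distance apart
        if abs(d(a, b) - d(b, a)) > tol:
            return False  # A to B must equal B to A
    for a, b, c in product(points, repeat=3):
        if d(a, b) > d(a, c) + d(c, b) + tol:
            return False  # going via a third point C must never be shorter
    return True


def euclidean(p, q):
    """Ordinary straight-line distance in the plane."""
    return ((p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2) ** 0.5


def squared_distance(p, q):
    """Squared straight-line distance: looks like a distance, but is not a metric."""
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2


points = [(0, 0), (1, 0), (2, 0), (0, 3)]
print(is_metric(points, euclidean))         # True
print(is_metric(points, squared_distance))  # False
```

The squared distance fails the last property: going from (0, 0) to (2, 0) directly "costs" 4, while going via (1, 0) costs only 1 + 1 = 2.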

.

Disclaimer:

I am not a professional in this field, nor do I claim to know all of the jargon that is typically used in this field. I am not summarizing my sources; I simply read from a variety of websites until I feel like I understand enough about a topic to move on to what I actually wanted to learn. If I am inaccurate in what I say or you know a better, simpler way to explain a concept, I would be happy to hear from you :).

What is the gist of an “ill-conditioned matrix”?

“If the condition number [of a matrix] is not too much larger than one, the matrix is well-conditioned, which means that its inverse can be computed with good accuracy. If the condition number is very large, then the matrix is said to be ill-conditioned. Practically, such a matrix is almost singular*, and the computation of [an ill-conditioned matrix’s] inverse, or solution of a linear system of equations is prone to large numerical errors. A matrix that is not invertible has condition number equal to infinity.” (From “Condition number,” Wikipedia, retrieved 6/3/2020. Emphasis added)

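*If a matrix is “almost singular,” its determinant is often close to zero relative to the size of its entries. A small determinant by itself is not a reliable test, though: scaling a well-conditioned matrix by a small factor shrinks the determinant without changing the condition number.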
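
As a rough illustration of the quote above, the sketch below computes a condition number for a few 2x2 matrices. It assumes the infinity-norm definition (the norm of the matrix times the norm of its inverse), which is only one of several common choices, and the helper names `inf_norm`, `inverse_2x2`, and `condition_number` are invented for this example.

```python
def inf_norm(m):
    """Infinity norm of a 2x2 matrix: the largest absolute row sum."""
    return max(abs(row[0]) + abs(row[1]) for row in m)


def inverse_2x2(m):
    """Inverse of a 2x2 matrix via the closed-form formula."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is singular, so it has no inverse")
    return [[d / det, -b / det], [-c / det, a / det]]


def condition_number(m):
    """Condition number in the infinity norm: ||A|| * ||A^-1||."""
    return inf_norm(m) * inf_norm(inverse_2x2(m))


well_conditioned = [[2.0, 0.0], [0.0, 1.0]]
nearly_singular = [[1.0, 1.0], [1.0, 1.0001]]      # rows are almost identical
tiny_but_fine = [[0.001, 0.0], [0.0, 0.001]]       # determinant is about 1e-6

print(condition_number(well_conditioned))  # 2.0
print(condition_number(nearly_singular))   # roughly 40,000
print(condition_number(tiny_but_fine))     # roughly 1
```

The third matrix is a reminder that a tiny determinant does not by itself mean ill-conditioned: it is just the identity matrix scaled down, and its condition number stays at about 1.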

.

What is the gist of a “Condition Number”?

“The condition number… is used to measure how sensitive a function is to changes or errors in the input… A problem with a low condition number is said to be well-conditioned, while a problem with a high condition number is said to be ill-conditioned. In non-mathematical terms, an ill-conditioned problem is one where, for a small change in the inputs (the independent variables or the right-hand-side of an equation) there is a large change in the answer or dependent variable” (From “Condition number,” Wikipedia, retrieved 6/3/2020, italics added to highlight the main answer to the opening question).
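
Here is a small sketch of what “a small change in the inputs… a large change in the answer” can look like in practice. It solves the same 2x2 system of equations twice, changing one right-hand-side value by only 0.0001. The matrix and the helper `solve_2x2` are assumptions chosen for this illustration, not something taken from the Wikipedia article.

```python
def solve_2x2(m, rhs):
    """Solve the 2x2 system m * x = rhs using Cramer's rule."""
    (a, b), (c, d) = m
    det = a * d - b * c
    x1 = (rhs[0] * d - b * rhs[1]) / det
    x2 = (a * rhs[1] - rhs[0] * c) / det
    return x1, x2


# An ill-conditioned system: the two equations are almost the same line.
matrix = [[1.0, 1.0], [1.0, 1.0001]]

x_original = solve_2x2(matrix, [2.0, 2.0001])
x_perturbed = solve_2x2(matrix, [2.0, 2.0002])  # right-hand side nudged by 0.0001

print(x_original)   # approximately (1, 1)
print(x_perturbed)  # approximately (0, 2)
```

A change of 0.0001 in one input moved each part of the answer by about 1, which is the kind of amplification a high condition number warns about.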

.

What is the Gist of “Oversampling and undersampling in data analysis”?

If you have an imbalanced dataset, “you can change the dataset that you use to build your predictive model to have more balanced data.

“This change is called sampling your dataset and there are two main methods that you can use to even-up the classes:

  1. “You can add copies of instances from the under-represented class called over-sampling (or more formally sampling with replacement), or
  2. “You can delete instances from the over-represented class, called under-sampling

“…These approaches are often very easy to implement and fast to run. They are an excellent starting point.

“You can learn a little more in the Wikipedia article titled ‘Oversampling and undersampling in data analysis.’”

Note that the author of the quoted article also lists several other methods for dealing with imbalanced datasets, if you are interested.
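
The two methods quoted above can be sketched with nothing but Python's standard library. This is a minimal illustration of the random versions of each idea, not the quoted author's code; the function names, the `label_of` parameter, and the 90/10 example data are all made up here.

```python
import random
from collections import Counter


def group_by_class(rows, label_of):
    """Collect rows into a dict keyed by class label."""
    groups = {}
    for row in rows:
        groups.setdefault(label_of(row), []).append(row)
    return groups


def random_oversample(rows, label_of, seed=0):
    """Add random copies of rows from smaller classes until all classes match the largest."""
    rng = random.Random(seed)
    groups = group_by_class(rows, label_of)
    target = max(len(group) for group in groups.values())
    balanced = []
    for group in groups.values():
        balanced.extend(group)
        # Sampling with replacement, so the same row may be copied more than once.
        balanced.extend(rng.choices(group, k=target - len(group)))
    return balanced


def random_undersample(rows, label_of, seed=0):
    """Randomly keep only as many rows from each class as the smallest class has."""
    rng = random.Random(seed)
    groups = group_by_class(rows, label_of)
    target = min(len(group) for group in groups.values())
    balanced = []
    for group in groups.values():
        # Sampling without replacement, so no row is kept twice.
        balanced.extend(rng.sample(group, target))
    return balanced


def label_of(row):
    """Each example row is an (id, label) pair; the label is the second item."""
    return row[1]


# Made-up imbalanced data: 90 rows of one class and 10 of the other.
rows = [(i, "majority") for i in range(90)] + [(i, "minority") for i in range(90, 100)]

oversampled = random_oversample(rows, label_of)
undersampled = random_undersample(rows, label_of)

print(Counter(label_of(r) for r in oversampled))   # 90 of each class
print(Counter(label_of(r) for r in undersampled))  # 10 of each class
```

Over-sampling keeps every original row but repeats minority rows, while under-sampling throws away majority rows, so each one trades off differently between duplicated data and lost data.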

.

What is the Gist of “Imbalanced Datasets”?

Suppose you have a population that is divided into classes, such as “male, female” or “0-17 years old, 18-30 years old, 31-60 years old, 61+ years old.” If each class in your population has the same number of people/objects, a random sample of the population should give you a roughly “balanced dataset.” On the other hand, if you take a random sample of a population whose classes do not all have the same number of people/objects, you will likely end up with an “imbalanced dataset.”

Imbalanced datasets can be problematic in machine learning because a program may simply predict the most common category every time and still achieve high accuracy. For example, a program predicting the age of individuals in a middle school might learn that it is almost always right if it just predicts that each individual is under 18 years old.
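
The middle school example can be put into numbers with a short sketch. The counts below (480 students and 20 adult staff) are invented purely for illustration, as is the function name `majority_baseline_accuracy`.

```python
from collections import Counter


def majority_baseline_accuracy(labels):
    """Accuracy of a 'model' that always predicts the single most common label."""
    most_common_label, count = Counter(labels).most_common(1)[0]
    return most_common_label, count / len(labels)


# Hypothetical middle school: 480 students under 18 and 20 adults.
labels = ["under_18"] * 480 + ["18_or_older"] * 20

label, accuracy = majority_baseline_accuracy(labels)
print(label, accuracy)  # under_18 0.96
```

A 96% accuracy sounds impressive, yet this "model" never identifies a single adult correctly, which is why accuracy alone can be misleading on an imbalanced dataset.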

.
