Use Case of K-means Clustering in the Cyber Security Domain

What is Clustering?

In simple words, the aim is to segregate data points with similar traits and assign them into clusters.

What is K-means Clustering?

The way the K-means algorithm works is as follows:

  1. Specify the number of clusters, k.
  2. Initialize the k centroids by randomly selecting k data points.
  3. Assign each data point to its closest centroid.
  4. Recalculate the position of each centroid as the mean of the data points assigned to it.
  5. Repeat steps 3 and 4 until the centroids no longer move.
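
The steps above can be sketched in a few lines of plain Python. This is only a minimal illustration, not a production implementation: the function name k_means, the max_iter and seed parameters, and the choice of Euclidean distance are assumptions made for this example.

```python
import math
import random


def k_means(points, k, max_iter=100, seed=0):
    """Cluster `points` (a list of equal-length tuples) into k groups."""
    rng = random.Random(seed)
    # Step 2: pick k distinct data points as the initial centroids.
    centroids = rng.sample(points, k)
    labels = []
    for _ in range(max_iter):
        # Step 3: assign each point to its nearest centroid (Euclidean distance).
        labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        # Step 4: move each centroid to the mean of its assigned points.
        new_centroids = []
        for j in range(k):
            members = [p for p, label in zip(points, labels) if label == j]
            if members:
                mean = tuple(sum(axis) / len(members) for axis in zip(*members))
                new_centroids.append(mean)
            else:
                # An empty cluster keeps its previous centroid.
                new_centroids.append(centroids[j])
        # Step 5: stop once the centroids no longer move.
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids, labels
```

Calling k_means(points, 2) on a list of 2-D tuples returns the final centroids and one cluster label per point. Because the initial centroids are chosen at random, different seeds can give different results on data that is less cleanly separated.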

Use Case in the Cyber Security Domain

(a) Cyber Profiling

The cyber profiling process can be directed toward:

  • Identification of the users of computers that have been used previously.
  • Mapping of the subject's family, social life, work, or network-based organizations, including those for whom he/she worked.
  • Provision of information about the user regarding his/her ability, level of threat, and vulnerability to threats.
  • Identification of the suspected abuser.

In a broader scope, cyber profiling can provide supporting information in cases such as counterintelligence and counterterrorism.

A newer approach to cyber profiling is to use clustering techniques to classify Web-based content through user preference data. These preferences can be interpreted as an initial grouping of the data, so that the resulting clusters show user profiles.
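
One way to picture user preference data as clustering input is to turn a visit log into one numeric vector per user. The sketch below is a hypothetical illustration only: the (user, category) log format, the category names, and the function preference_vectors are assumptions for this example and do not come from any particular profiling tool.

```python
from collections import Counter, defaultdict


def preference_vectors(visits, categories):
    """Turn (user, category) visit pairs into per-user share vectors."""
    counts = defaultdict(Counter)
    for user, category in visits:
        # Ignore visits to categories outside the chosen feature set.
        if category in categories:
            counts[user][category] += 1
    vectors = {}
    for user, counter in counts.items():
        total = sum(counter.values())
        # Each component is the share of the user's visits in that category.
        vectors[user] = tuple(counter[c] / total for c in categories)
    return vectors


# Hypothetical toy log, for illustration only.
visits = [("alice", "news"), ("alice", "news"), ("alice", "social"),
          ("bob", "shopping")]
vectors = preference_vectors(visits, ("news", "social", "shopping"))
```

Each resulting vector can then be treated as one data point for K-means, so users with similar browsing preferences end up in the same cluster.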


Preprocessing is performed to remove duplicate data, check for inconsistencies in the data, and correct errors such as typographical mistakes.
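
A minimal sketch of such a preprocessing pass might look like the following. The record layout of (user_id, category) pairs, the TYPO_FIXES table, and the rule that a record with an empty field counts as inconsistent are all assumptions chosen for illustration.

```python
# Hypothetical correction table for known misspellings of category names.
TYPO_FIXES = {"soical": "social", "nwes": "news"}


def preprocess(records):
    """Clean (user_id, category) pairs; both field names are illustrative."""
    seen = set()
    cleaned = []
    for user_id, category in records:
        # Normalize surrounding whitespace and letter case.
        user_id = (user_id or "").strip()
        category = (category or "").strip().lower()
        # Inconsistency check: skip records with a missing field.
        if not user_id or not category:
            continue
        # Correct known typographical errors.
        category = TYPO_FIXES.get(category, category)
        row = (user_id, category)
        # Duplicate removal: keep only the first occurrence of each row.
        if row not in seen:
            seen.add(row)
            cleaned.append(row)
    return cleaned
```

For example, the records ("u1", "News") and ("u1", "nwes ") both normalize to ("u1", "news"), so only one of them is kept.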

Thank You :)


