Programming, Security, Privacy, Technical

A Study in Password Generation and Validation

For those who would rather just read the code: https://github.com/calebshortt/pwanalysis

[Updated  April 16, 2019] I have included further results in an update at the bottom of this post.

 

Introduction

A study.

On many a dreary evening, once the sun sets beyond the Pacific and leaves only the gold and pink flakes of cloud and sky, I take to my studies — in the literal sense. On my desktop I maintain a growing number of “Studies” that pique my interest and encourage further research. When time permits I dive into one and if such a study bears fruit it will find its way upon these pages.

This study continues my obsession with passwords.

Firstly, I must ask you a question: Have you ever read through a password dump?

I do not mean to ask if you perused the listing then used it in some attack, but did you truly “digest” the list of passwords one by one? Did you consider their meaning or guess on the context in which they were conjured within the mind of one unsuspecting human and in which immensely intimate feelings and thoughts often manifest themselves? For who chooses a password with the thought that this very string of characters, words, symbols, and what-have-you would one day be revealed in its entirety to the world?

“*Gasp!* Plaintext?”, you balk, “But this is 2019!”. Perhaps you should ask Facebook what year it is.

I have found that passwords often reveal much more about an individual than the crude challenge-answer mechanisms utilized for authentication. If a user’s digital identity (i.e. username, etc.) is their public-facing persona then their password is often their private form. It was through my inspection of these password lists that I encountered three thoughts:

  • “Password language” is a language all it’s own and it DOES have loosely established, unspoken, syntax rules.
  • The frequency distribution of characters in passwords could be quite different to a spoken language frequency distribution.
  • The relationship between the personality of a user and their password is non-random (sounds obvious)

Continue reading

Standard
Programming, Security, Privacy, Technical

ShillBot: A Study in Identifying Reddit Trolls Through Machine Learning

For those who would rather just read the code: https://github.com/calebshortt/shillbot

 

Introduction

We’ve all been there. You’re browsing Reddit and see a post that you’re passionate about. You click the comment box and reach for the keyboard — but hesitate. Reddit’s reputation precedes it. You type anyways and punch out your thoughts. Submit.

*Bliip*

A comment already? You click the icon and read the most disproportionately-voracious response to a comment about cats you have ever seen. What a jerk! But you’re not going to play that game, and view the author’s previous posts and comments. Through your review a trend of tactless comments and inflationary responses bubbles to the surface. They’re a troll. You promptly ignore the comment.

Continue reading

Standard
Programming, Technical

An Experiment on PasteBin

A while ago I was browsing the public pastes on PasteBin and I came across a few e-mail/password dumps from either malware or some hacker trying to make a name for himself.

As I perused the information, I was shocked to find usernames, emails, passwords, social security numbers, credit card numbers, and more in these dumps. I reported the posts as credit card info and SSNs are nothing to trifle with, but the thought lingered as to why they were public in the first place. There must be a way to automate the process of reporting these posts, I thought, usernames and especially passwords hold a very unique signature: at least one upper-case letter, at least one lower-case letter, at least one digit, and at least 8 characters long.

How many words in the english language have that particular combination?

Continue reading

Standard