YaAAbu Thoughts: September 2022

Friday, 23 September 2022

Third Party Authentication

What is third-party authentication? A simple example is when a user signs in to an account at application X using Google: instead of application X authenticating the user itself, it asks Google to authenticate the user and then tell the application who the user is. There are three main players involved in the social login process:

The User, who wants to access an application.
The Application, which wants to identify a user and get related information.
The Authorization provider, which confirms the user's identity and provides the needed data.

One motivation is that smaller businesses usually can't afford a strong authentication service. The user also wants a faster and more secure login experience, and the third party can get valuable data about the user from these requests, so it is a win for everyone. To recap, the advantages are:

The authentication responsibility moves to a stronger authentication system.
The user doesn't need to enter information that already exists somewhere else, and if the user changes that information, all linked websites pick up the change automatically (if they fetch the data each time).
Some data no longer needs to be verified (like the email address when the user signs in with Google).
The password is not reused across multiple applications, and the user enters it less often, so there is less worry about leaking it.
The user controls which information is shared, and can view and disable third-party sign-in access from the original account.

The first disadvantage is that the third-party account has access to a wide collection of other accounts, so hacking it becomes more valuable as time passes. It acts much like a password manager, but with the ability to deactivate any account.

The other disadvantage is that the application is limited by the third party's restrictions, privacy policy, etc., which may affect the sign-up process and add extra steps that make the third-party option less practical.

How does third-party authentication work?

Third-party authentication is only possible using tokens; no one would feel secure if random applications asked for the login credentials of an important account.

The sign-up and sign-in flows are very similar. The application redirects the user to the third-party website and sends some information along, such as the identity of the application and the access permissions (scope) it needs. The third party verifies the user's identity and, in the callback, hands back a code. The application uses this code to request a token with the mentioned scope. The code works only for this application, and the resulting token can fetch only the scope that was displayed to the user. Finally, the application uses the token to fetch the data and inserts it into its own database (if needed).
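The steps of this flow can be sketched in a few lines of Python. This is only a minimal sketch under stated assumptions: the endpoint, client ID, and redirect URI are hypothetical placeholders, the parameter names follow the OAuth2 authorization code grant, and the random `state` value is an extra safeguard that the text above does not mention. The last function only builds the form fields for the token request; it does not send anything over the network.

```python
import secrets
from urllib.parse import urlencode, urlparse, parse_qs

# Hypothetical values; a real application gets these when it registers with the provider.
AUTHORIZE_URL = "https://provider.example/oauth/authorize"
CLIENT_ID = "my-app-id"
REDIRECT_URI = "https://app.example/callback"


def build_authorization_url(scope):
    """Step 1: build the URL the application redirects the user to."""
    # Random value that lets the application match the callback to this login attempt.
    state = secrets.token_urlsafe(16)
    query = urlencode({
        "response_type": "code",
        "client_id": CLIENT_ID,        # identity of the application
        "redirect_uri": REDIRECT_URI,  # where the provider sends the user back
        "scope": " ".join(scope),      # access permissions the application asks for
        "state": state,
    })
    return f"{AUTHORIZE_URL}?{query}", state


def read_callback(callback_url, expected_state):
    """Step 2: extract the one-time code the provider attached to the callback."""
    params = parse_qs(urlparse(callback_url).query)
    if params.get("state", [None])[0] != expected_state:
        raise ValueError("state mismatch: callback does not belong to this login")
    return params["code"][0]


def token_request_body(code, client_secret):
    """Step 3: form fields the application would POST to the provider's token endpoint."""
    return {
        "grant_type": "authorization_code",
        "code": code,
        "redirect_uri": REDIRECT_URI,
        "client_id": CLIENT_ID,
        "client_secret": client_secret,
    }
```

The client secret in the last step is what ties the code to this one application: another application that intercepts the code cannot exchange it without the secret.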

There may be small adjustments to the flow depending on the application: some applications want a more secure flow, and others need more data than can be fetched from the third party (because it is missing or hidden).

The token is usually an ID token or an access token with a limited scope, depending on the application.

One problem appears when the application tries to integrate with several third-party services: each one has to be implemented separately according to the provider, which often duplicates effort across small applications. One suggested solution is, instead of integrating directly with each provider (Google, Facebook, etc.), to integrate with a service that has already integrated with them (like Auth0 or Firebase).

Here is an example of OAuth2 authorization. If you check different third-party providers, you will notice slight differences between them.

c# - How to create custom login provider like third party login provider? - Stack Overflow

Sunday, 18 September 2022

The Success Secret

This short article is about a hot topic that is widely discussed these days: "the way to success".

Over the last few months, I have abandoned several books on different topics (most of them highly recommended and labeled "best-selling in XXXX"). There were multiple reasons for skipping them, such as poor writing or filling the book with random things unrelated to its topic. One of the reasons is a pattern I find annoying: the author throws statements at you as if they were known facts, but these "facts" are not based on experiments or proven knowledge, only on the author's point of view, and many times that point of view rests on things like "survivorship bias (or survivor bias)" and "the holy success".

Survivorship bias

This term means looking only at the survivors instead of the whole group. This bias is one of the best-known causes of misleading survey results and false beliefs, and it leads to wrong decisions later.

Many speakers market themselves by mentioning that "X attended my courses" or "Y people attended and then became successful". They never mention the thousands who attended and still failed.

I remember that a few years ago one of the lecturers I love to listen to said that if you pick a random group (with the same ideology and business), most of the time it will have about the same distribution of success and failure as a group that attended the "success course".

That doesn't mean success is purely random or that self-improvement is a myth. What I want to say is that "doing x improves y" is a relationship that needs a lot of experiments and surveys to prove; it is not proven just because a random guy tells you so, because he believes it or tried it himself.

It worked for me

Experiments that measure the relationship between two variables sometimes consume millions of dollars and a lot of time. The most critical part of most of these experiments is not the measurement itself; it is finding all the other variables that could affect the relationship and making sure they do not change during the experiment. Even after this step, the experiment has to be repeated multiple times to make sure the result stays the same.

It is foolish to think that a single experiment is enough. After stating the new "facts", the author will face many cases showing they are not true, and the author's answer is well known: "That's because they didn't do it the right way." That "right way" is never a clear set of steps or rules; it is just a bunch of words in the author's mind, and each time he is asked about it he will probably give a cloudy answer whose exact meaning no one understands, along with the sentence "if you understand what I said, you will succeed".

This point isn't against sharing self-improvement experiences, but tying the action to the result is not reliable. Imagine someone who trained for 4 years and had an accident just before the Olympiad, so he couldn't compete: does that mean everything he did was wrong? Imagine a person who did nothing in his entire life except sleep, eat, and play, yet gained millions from his rich father: does that mean doing the same will make you gain millions?

The survey approval

Many times you hear someone cite a survey that you will never find no matter how much you search. Other times the survey exists, but you can't find its details (the selected groups, units, other variables, etc.). It is easy to go to a school for rich families and come out with a survey saying that the average pocket money for a school student is $100 per day, or to go to a village where most women don't work and say that men earn $30K more than women.

Whoever runs the survey is human, which means he or she can be biased, a liar, or stupid, and the same naturally applies to organizations. Even when someone wants the truth, some things can affect the result without being noticed: the same people can give different answers to the same questions depending on the order of the questions, the wording, the environment, the time, etc.

The holy success packages

When talking about success, many people have a really strong bias toward the survivors, which in many cases results in following each of their steps without thinking about whether it is logical. If someone does x, y, and z and becomes the most successful person in people's eyes, most of them will just think "let's do x, y, and z", regardless of whether x, y, and z are good or bad.

For them, it is forbidden to think that x was a minor factor in the success, y was not a factor at all, and z was the most critical factor in the process.

I am not saying that I know the way to success. All I want to say is that if success were an easy thing that anyone could achieve by reading a badly written book, it would not be what I call success.

Image reference: https://kenzie.snhu.edu/blog/what-is-success/

Thursday, 8 September 2022

Authentication Tokens

In the last article, I discussed the password as a method to identify the user. Before diving deep into the other methods, I want to discuss the authentication token.

After the user logs in and the system recognizes him, it would be bad practice to send the login credentials (password) with each request: storing and sending sensitive data frequently is something to avoid, it forces every API to repeat the login process at the back-end to confirm that this is the real user, and it becomes impossible if the sign-in secret changes periodically (like an OTP or 2FA). Two methods handle this: the session and the token.

Session:

This method solves the problem that the browser would otherwise need to store the password and send it to the server each time. The server creates a new session, adds the user data (ID, IP, browser, action time, etc.) to it, and returns the session ID to the user, where it is stored in a cookie. The method has some problems. One of them is that the session data is stored in the database, so each request makes an extra call to the database, which also turns the database into a bottleneck that can be attacked.

However, the session method is still used worldwide. Its advantages are that it is simple to understand and implement, it tracks the user's actions in one place, and the server can disable a session just by deleting its row from the database.
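To make the session idea concrete, here is a minimal sketch in Python. It is an illustration under assumptions, not a production design: the `SessionStore` class and its method names are hypothetical, a plain dictionary stands in for the database table, and the 30-minute lifetime is an arbitrary choice.

```python
import secrets
import time

SESSION_TTL_SECONDS = 30 * 60  # assumed idle lifetime, not taken from any standard


class SessionStore:
    """In-memory stand-in for the session table in the database."""

    def __init__(self):
        self._rows = {}  # session id -> session data

    def create(self, user_id, ip, now=None):
        """Called after a successful login; returns the id to put in the cookie."""
        session_id = secrets.token_urlsafe(32)  # unguessable random id
        self._rows[session_id] = {
            "user_id": user_id,
            "ip": ip,
            "last_action": time.time() if now is None else now,
        }
        return session_id

    def lookup(self, session_id, now=None):
        """Called on every request: this is the extra database call."""
        now = time.time() if now is None else now
        row = self._rows.get(session_id)
        if row is None:
            return None
        if now - row["last_action"] > SESSION_TTL_SECONDS:
            del self._rows[session_id]  # expired sessions are dropped
            return None
        row["last_action"] = now  # track the user's latest action
        return row["user_id"]

    def revoke(self, session_id):
        """Disabling a session is just deleting its row."""
        self._rows.pop(session_id, None)
```

Note that the cookie only ever carries the random session ID; everything the server knows about the user stays on the server side.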

The other method I want to discuss today is the token. The server returns a token to the user, the user sends it with each request, and as long as the token is valid the server does not ask the user to log in again.

What is the authentication token?

A token is generally an authentication string that contains some information and is cryptographically signed (and sometimes also encrypted), so that a third party cannot modify it and still have it pass validation. That means no one can simply change the user ID and use the token as someone else.

An authentication token is formed of three key components: the header, the payload, and the signature:

The header: defines the token type being used, as well as the signing algorithm involved.
The payload: defines the token issuer and the token’s expiration details. It also provides information about the user plus other metadata.
The signature: verifies the authenticity of the message and that the message has not changed in transit.

The big advantage here is that the server doesn't need to store any token-related data in the database; it just validates the token to make sure no one has changed the message. Another advantage is that tokens can carry different scopes for different users, which supports things like third-party sign-in, where the user wants to share only part of their information with an app without giving it the ability to change their data, post something, or see their payment methods.

The additional data can be very useful: the server can include data it needs in some cases, like the user type that determines whether the user is authorized to perform an action. This data should be static and should not normally change during the token's life span; if it does change, it shouldn't be included in the token. That is one of the main weak points of the token: if the server wants to change anything about a token, it can't do anything except send a new token to the user and hope no one uses the old one until it expires, or add a database that handles this case.

The weakest point the server should be aware of is that someone can still steal the token from the device and pretend to be the user. Providers are aware of that: some give the user an option to disable a token if needed, and some only issue short-lived tokens, so that the damage is limited if one is stolen (this can explain why changing a password requires re-entering the current password while "forgot password" does not).

The problem of "how to disable a token" is a bit interesting. Suppose the server makes the token include everything it needs (user ID, user type, access type or scope, etc.) and then something changes, for example the user is blocked or wants to destroy the token because it is no longer needed. The server has 2 options to handle this:

Move any data that may change, like the user type, out of the token and into the database. This brings back the need to check the database, but the cost can be limited by checking these parameters only in critical APIs, like payment and changing user data. The server then limits the token instead of disabling it, because someone can still view data with it; if viewing the data is itself sensitive, the server can't use this method. It is mainly a trade-off that depends on the application.
Make the token expire early, so that if someone gets hold of it, it soon becomes useless. However, a very short token life adds load on the token generator, and the user will be unhappy about logging in frequently. That can be solved by adding constraints inside the application itself for payment and security changes and, instead of replying to the login credentials with only a short-lived token, also replying with a refresh token.

There are different types of tokens, depending on the expiration date and the data provided with the token. Let's mention some here:

Access Tokens: these are credentials used to access protected resources, and they are used as bearer tokens. The access token allows its holder to access resources and perform actions based on its scope. It is meant to be a short-lived token that expires after some minutes or a few hours, but you can see some applications whose access tokens last for months; that mainly depends on how they handle tokens overall.
ID Tokens: these are proof that the user has been authenticated. This token mainly carries the user info; it is used to verify identity and never gives access to any protected resource. It is also a short-lived token that usually lasts for a few hours.
Refresh Tokens: these are long-lived tokens that are mainly used to generate new tokens of the other types. They solve the problem of the user having to log in again when a token expires. These tokens are tracked, and the server can check whether they have been disabled, which is affordable because they are used far less often than the other tokens. A refresh token works like a hashed password, except that it expires after months and grants tokens that may have a limited scope.
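The combination of a short-lived access token and a tracked refresh token can be sketched as follows. Everything here is an assumption made for illustration: the function names, the 15-minute and 90-day lifetimes, the dictionary that stands in for the refresh-token table, and the simplified `user|expiry|mac` access token format (a real system would more likely use a standard format such as JWT).

```python
import hashlib
import hmac
import secrets

SECRET_KEY = b"server-side-secret"  # hypothetical signing key
ACCESS_TTL = 15 * 60                # assumed: access tokens live 15 minutes
REFRESH_TTL = 90 * 24 * 3600        # assumed: refresh tokens live about 3 months

# Stand-in for a database table: sha256(refresh token) -> (user id, expiry time).
# Only the hash is stored, so a database leak does not expose usable refresh tokens.
_refresh_rows = {}


def _digest(token):
    return hashlib.sha256(token.encode()).hexdigest()


def _mac(body):
    return hmac.new(SECRET_KEY, body.encode(), hashlib.sha256).hexdigest()


def issue_access_token(user_id, now):
    """Stateless token: nothing about it is stored on the server."""
    body = f"{user_id}|{int(now) + ACCESS_TTL}"
    return f"{body}|{_mac(body)}"


def check_access_token(token, now):
    """Validate signature and expiry without touching the database."""
    body, _, mac = token.rpartition("|")
    if not hmac.compare_digest(_mac(body).encode(), mac.encode()):
        return None
    user_id, _, expires_at = body.rpartition("|")
    return user_id if int(expires_at) > now else None


def login(user_id, now):
    """Reply to a successful login with both token types."""
    refresh = secrets.token_urlsafe(32)
    _refresh_rows[_digest(refresh)] = (user_id, now + REFRESH_TTL)
    return issue_access_token(user_id, now), refresh


def refresh_access(refresh, now):
    """Trade a still-valid refresh token for a new access token (one database lookup)."""
    row = _refresh_rows.get(_digest(refresh))
    if row is None or row[1] <= now:
        return None
    return issue_access_token(row[0], now)


def revoke(refresh):
    """Disabling the refresh token stops any further renewals."""
    _refresh_rows.pop(_digest(refresh), None)
```

With this split, a stolen access token is useful for at most 15 minutes, and the database is consulted only on the rare refresh calls rather than on every request.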

Notes:

I talked mainly about online tokens, but there are other types, like hardware tokens, where the token is stored on a specific device. 2FA codes are a type of token called disconnected tokens, but we will discuss them in a separate article.

If you are interested in how a signature is made and why it's impossible to edit the message without affecting the signature, you can read more about public-key cryptography as a start.

Image reference: https://www.wallarm.com/what/token-based-authentication

Friday, 2 September 2022

Account Security using Password

In this article, I will talk about the first and oldest security method, the password. This article is part of a series about account security methods and why we hear the word "password-less" more often lately.

Talking about account security takes us to the “password-less” topic, which many big tech companies are putting a lot of effort into these days. In the old days, account security mainly targeted admins and employees, as there was no real value in stealing random email accounts with nothing attached to them. However, account security becomes more important each day, as many websites now support online purchases and many accounts contain payment cards and other valuable things.

Let’s first see how the sign-up/sign-in process is done. There are three parties in that process: the user, the network, and the server. The network is mostly out of scope for our discussion, but the network layer does provide security methods that protect the user from a man in the middle of the connection between users and servers who steals users’ info.

In the past, there were few methods to sign in, and the most popular among them was the password. Applications stored passwords in the database and, for each login request, compared the given password with the stored one. That was bad practice, because anyone who got into the database could see all the passwords and reuse them on this website or on others.

Currently, most companies rely on the fact that the user has only one password and that they only need to know whether it matches the input text. So they hash passwords before storing them in the database and, when a login request is received, hash the given text and compare it with the hashed string in the database. It may look as if nothing has changed, but with this step the database no longer holds the password values: the hashed strings cannot be reversed to the original passwords, so you eliminate one of the big sources passwords can leak from.
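The hash-then-compare step can be sketched with Python's standard library. The sketch goes slightly beyond the text above and that is an assumption on my side: it adds a random per-user salt and uses PBKDF2, a deliberately slow hash, with an iteration count picked only for illustration.

```python
import hashlib
import hmac
import os

ITERATIONS = 200_000  # assumed work factor, chosen for illustration only


def hash_password(password):
    """At sign-up: return the (salt, digest) pair to store instead of the password."""
    salt = os.urandom(16)  # random salt, so equal passwords get different digests
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, digest


def check_password(password, salt, stored_digest):
    """At login: hash the given text the same way and compare the results."""
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    # Constant-time comparison avoids leaking how many leading bytes matched.
    return hmac.compare_digest(candidate, stored_digest)
```

The database row then holds only the salt and the digest. Someone who reads the row still has to guess candidate passwords and hash each one, which is exactly where weak and predictable passwords become the problem.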

So, the problem is solved; why password-less? Let’s first talk about why the password isn’t as strong a method as users imagine. Here are the main reasons passwords can’t scale as an account security method:

Constant: most applications let the password stay the same for years, and each day that passes with the same password makes the account easier to hack.
Personal based: the password depends mainly on the user's mind and personal info. A user with no security background (which is the majority) will choose an easy-to-remember password, which is easy to guess as well. The problem is that people don’t notice they are building a password from public information that anyone can get online (name and birth date).
Constraints: when websites don’t enforce constraints on passwords, some users set very weak ones. However, adding constraints didn’t improve the overall result; it made passwords more predictable. A constraint like "must have a capital letter" results in the majority of passwords having a capital first letter. One more important point is that password constraints are visible to everyone, so the hacker also knows the constraints every password satisfies.
One for all: the majority of users use the same password for everything on the internet: email, social accounts, bank accounts, online shopping websites, and random websites with zero security (websites that send the username-password pair as a parameter in the URL at login). Most hackers know that as well, so getting email-password pairs from a random website is worth the effort, as these pairs can be used on other websites.
Not random: most people think that brute-forcing a password is hard because it’s a random string, but in most cases that doesn't apply. The user will type a password in English, the first letter will probably be capitalized, the letters will form a name or a word, etc. Thinking carefully about these facts makes you realize that brute force on passwords isn’t that hard.
Common: there are common passwords that many users use, like “Monkey”, “welcome”, “password”, etc., and there are also famous patterns like Name + ’@’ + birth year.
Attacks became stronger: after each attack, hackers know more about password patterns and develop more powerful tools that guess passwords more accurately, and each new generation of devices gives them more computing power to guess passwords and run stronger tools.
Hard to fix from the server side without side effects: trying to handle any of the above points from the server side has a high chance of decreasing the sign-up conversion rate, which hits online business growth. Forcing users to use strong passwords also rapidly increases “forgot password” traffic.
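The "Not random" and "Common" points can be made concrete with some rough arithmetic. The numbers below are assumptions chosen only for illustration: 94 printable ASCII characters, an 8-character password, a list of 10,000 names, 100 plausible birth years, and an attacker making one billion guesses per second.

```python
def random_space(alphabet_size, length):
    """Number of candidates if every character is chosen independently at random."""
    return alphabet_size ** length


def pattern_space(word_count, year_count):
    """Number of candidates for the pattern Name + '@' + birth year."""
    return word_count * year_count


def seconds_to_exhaust(space, guesses_per_second):
    """Worst-case time to try every candidate."""
    return space / guesses_per_second


# Assumed figures, for illustration only.
truly_random = random_space(94, 8)          # 8 random printable ASCII characters
name_and_year = pattern_space(10_000, 100)  # 10,000 names x 100 birth years
attacker_speed = 1_000_000_000              # guesses per second

print(f"random:  {truly_random:.2e} candidates, "
      f"{seconds_to_exhaust(truly_random, attacker_speed):.2e} s to exhaust")
print(f"pattern: {name_and_year:.2e} candidates, "
      f"{seconds_to_exhaust(name_and_year, attacker_speed):.2e} s to exhaust")
```

Under these assumptions the patterned password is billions of times easier to guess than a random string, even though it can be longer than 8 characters and satisfies the usual "capital letter, digit, symbol" constraints.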

Some people use one common password per group of accounts (grouping accounts by security level or importance), but this still has the "One for all" problem: breaking into one account gives hackers access to the others, and most of the time these different passwords share common patterns. Another solution is a password generator, which provides the user with a new randomly generated password for each account, but the password generator has many problems:

The way to access it is still a non-random password, which means you have only grouped the above problems into one big problem.
It saves all passwords in one place with no hashing, which means any hack into it is very valuable.
It is not realistic to ask users to change the passwords of 100 or more accounts when someone successfully hacks their password manager.
Password generators do not fix user behaviors such as sharing a password and logging in on random devices, and most users will not use them to their full potential.
They lack advanced features that give users more flexibility, which means you have to use one only on personal devices and don't have the option to log in on a temporary device and then mark it as a strange device (I will talk about this more when discussing break-in detection).

Still, the password generator is a better solution for websites with less value to the user (ones that don't contain payment or personal contact info that may lead to a hack chain), and you will not need to worry about low-security accounts if their passwords leak from the server side.
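The core of a password generator is small. Here is a minimal sketch using Python's `secrets` module; the function names, the default length of 20, and the example site names are hypothetical, and a real password manager would also need to encrypt the stored passwords, which this sketch does not do.

```python
import secrets
import string

# Letters, digits, and punctuation: every character is equally likely.
ALPHABET = string.ascii_letters + string.digits + string.punctuation


def generate_password(length=20):
    """Return a password drawn from a cryptographically secure random source."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


def passwords_for(sites, length=20):
    """One independent random password per site, so a leak on one site reveals nothing about the others."""
    return {site: generate_password(length) for site in sites}


if __name__ == "__main__":
    for site, password in passwords_for(["shop.example", "forum.example"]).items():
        print(site, password)
```

Because each password is generated independently, none of the patterns listed earlier (names, birth years, a capital first letter) apply, and the "One for all" problem disappears for the generated accounts. What remains is the single password that protects the generator itself.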

I will stop here so this article stays short and easy to read. In the next article, I will discuss the other authentication methods.

You can read more about anti-pattern passwords in this paper: [password-guidance]

You can read more about how easy it is to hack using password patterns: [Choosing Secure Passwords]

Image reference: https://spycloud.com/solutions/password-security/
