Let's get started explaining some day-to-day jargon used in the software development community, which should make your beginners days as easy as possible.
Pretty much all the topics listed below are significantly simplified for the sake of easier understanding. The idea is that you, as a reader, will be able to find more information about each topic online.
I remember I had significant frustrations at the very beginning of my career, being unable to understand quite some topics I heard for the first time. With the time it got better, but if there was someone at that time to answer all of the questions I had then - my life would have been much easier!
So, let me help you there!
API
A set of functions and procedures allowing the creation of applications that access the features or data of an operating system, application, or another service.
An application that's not integrated with a file system, file system, some remote service (e.g Facebook API, Github API ...) doesn't offer too much functionality on its own. An example of an API usage would be a backend application accessing the local file system's APIs to read file contents, or a ReactJS frontend application accessing Github Rest API to fetch some data to be presented on a web page.
It's a very common thing today one application communicates with various APIs.
Docker
A piece of software which allows us to ship our software in such a way that the one who wants to use it doesn't need to install or configure any other software to be able to use our software, apart from the Docker itself.
The idea is to package our software in a Docker understandable format, and then push it to some online registry, where others can take it and use it.
Version control / Git
Version control refers to the ability to control the versions of the digital artifacts, whatever it may be (e-books, word docs, mp3 files...).
In the software development sense, we're talking about the ability to manage versions of our software, as we change it.
Version control systems do not need to be exclusively used for code management: we can use it to version our images, word documents or any kind of digital artifact.
The advantage of versioning is the ability to track the progress of some software product, but also by versioning we have
a chance to easily jump from version to version and that may come handy in case we introduce a bug in version 1.4.2
,
but luckily we know that a version 1.4.1
worked just fine.
Git is a cross-platform software which technically enables us to manage software changes, by creating new versions.
Git as the software is what behind the scene powers services, such as Github.
Open-Source
Every piece of software is based on some source-code that was written for it to be able to execute. There are 2 kinds of the source-code:
- Closed-source
Examples of these would be Windows Operating System code-base, MacOS one, etc. Closed-source is mostly developed by the companies which do not want to publish their source-code, because the code is their competitive advantage, and brings them financial income.
- Open-source
The opposite side, which promotes values of public visibility of the code, so that everyone can access and view it. An additional step is an ability for everyone to change the code. This lead to the enormous popularity of public tools, like Github or Bitbucket, which provide an easy way for people to collaborate on software projects as well as to share code from each other.
Merge / Pull requests
A means (formal process) of contributing to the existing code-base. The process looks like this:
- There is an existing software code-base
- User A wants to extend existing functionality / fix existing bug
- (S)he prepares a bugfix or a feature on own development machine
- There is a User B who is considered a project maintainer (the one who knows the project well and ensures its quality)
- User A informs User B that (s)he wants the new feature/bugfix integrated, which is a formal process (most often) performed via Web UIs provided by the source-code hosting platform, e.g Github, Gitlab, BitBucket, etc
Synchronous / asynchronous execution
Synchronous might be more easily understood if we use term blocking instead. Some things need to be synchronous/blocking by their nature. Imagine you have a sample code:
function completeCheckout() {
storeOrderInDb(); // takes 2s
transferToOrderManagementSystem(); // takes 3s
sendConfirmationMail(); // takes 3s
}
Ideally, we should not design our systems to be synchronous / blocking. If, in the example from above, we decide that all 3
functions storeOrderInDb
, transferToOrderManagementSystem
and sendConfirmationMail
are synchronous, than user will have to
wait 8
seconds in total until he gets a response from the server.
If you, instead, decide that only the first method, storeOrderInDb
is synchronous, and other 2 aren't, that means that user will
have to wait only 2 seconds, until his order is successfully stored in the database, and his order will eventually be sent to order
management system and he will get a confirmation e-mail (this can wait a bit, and the user doesn't need to get this done immediately).
A good rule of thumb would be that we should design for asynchronous, and the user should not be blocked unless there is a valid reason to do it.
Software versioning
Software evolves over time.
It gets bugs fixed and new features added, as well. It can be that, for whatever reason, some user wants to use the first version of our application, whereas some other user wants to use the latest one. We need a way to store our updated software alongside its version somehow.
There are a couple of versioning schemas, I believe the most relevant one is called a Semantic versioning.
One additional reason to have versioning schema in place is an easier
way to rollback some software that contains a bug for example, to a previous, known to work version. So, if we have rolled
application version 1.2.3
and we want to roll it back, we could simply, from our software registry pull the version 1.2.2
and roll it out to the customers.
Logging
When developing our applications (mostly backend systems), it's very important that during the application runtime execution we preserve some sort of messages containing important information about a system or user behavior.
An example would be that if a user enters 3 times wrong login credentials we store this message so that we can troubleshoot/diagnose given use-case. The recommended way of doing it would be writing these messages to the standard out (console), which can later be searched through when needed.
Another case would be that in case our code talks to the database, and there's a connectivity issue, we could store the message, something along the lines: "There was a connection timeout connecting to the database x.y.z".
Often, these logging libraries contain various logging levels, such as DEBUG
, INFO
, FATAL
which help us set the importance of the things we are logging: we can then, for instance,
filter logs which are considered FATAL
and then do the analysis and the troubleshooting.
JSON
JSON
is a human-readable and a portable data serialization format . It's famous for being used for a data interchange between
browsers and servers, but not exclusive to that.
An example of JSON
serialized (formatted) data would be:
{
"type": "order",
"id": 12345,
"total": "255.59$",
"items": [
{
"name": "Ipad Pro",
"quantity": 2
},
{
"name": "Dell XPS 15",
"quantity": 3
}
]
}
JSON
provides a couple of data types, such as strings
, numbers
, objects
, arrays
, and all of them together are enough to
express any sort of data, which is one of the reasons why it's heavily used (apart from human readability).
Request / Response
Very popular interaction model between two (software) parties, where one side (client) initiates the communication by sending a request to the other party (server) which upon understanding and processing the request sends a response back to the client.
Scalability
A frequently used term is software scalability, which is an ability of the software to cope with the ever-increasing load, often as consequence of an increased number of users using the software. Applications should be designed in such a way to be able to handle the increased usage.
Abstraction
You will very frequently come across this term, as0 you progress in your career. Abstraction is a fancy name for hiding some complexity under some name. What does that mean?
Suppose we have a method
const getUserDetails = (userId) => {
const baseUserData = await http.get("https://api.users.com/users/${userId}")
.header('Accept', 'application/json')
.header('X-API-KEY', 'xxxx-yyyy')
.call();
const dbRow = dbClient.connect('my-database').executeQuery(
"select * from users where user_id = ${userId}"
).first();
return {
...baseUserData,
...dbRow
};
}
This method, as you may guess by being patient enough to read it entirely combines user data from the two datasources: HTTP
API and the
database. This imposes non-trivial cognitive load on the reader, which we may reduce by introducing abstractions:
const getUserDetails = (userId) => {
const baseUserData = getBaseUserData(userId);
const dbRow = getDatabaseDetails(userId)
return {...baseUserData, ...dbRow};
}
const baseUserData = (userId) => {
const baseUserData = await http.get("https://api.users.com/users/${userId}")
.header('Accept', 'application/json')
.header('X-API-KEY', 'xxxx-yyyy')
.call();
}
const getDatabaseDetails = (userId) => {
dbClient.connect('my-database', 'my-username', 'my-password').executeQuery(
"select * from users where user_id = ${userId}"
).first();
}
Here, we reduced the cognitive load to the user by hiding all the lower-level details under two abstractions: baseUserData
and getDatabaseDetails
.
The immediate advantage is that understanding getUserDetails
function is no longer a complex task. Another advantage is that two newly created
abstractions can even be reused in some other place in our applications.
This is just a short intro to abstractions, which is something you will likely master throughout your career.
Upstream / downstream
You will hear these terms relatively often: upstream server - downstream server.
Or upstream job - downstream job.
I was having difficulties with this one until I came across the article that simplified this one for me:
Upstream is a message sender. Downstream is a message receiver.
Imagine you have an HTTP
client and an HTTP
server.
In the case of:
HTTP
request triggered by the client, the client is upstream and the server is downstreamHTTP
response to this request,HTTP
server is upstream, andHTTP
client is downstream
You can also hear upstream job, downstream processing, etc.
Database
We use databases when we want to have some long term accessible data. There are many database solutions out there, optimized for particular use-cases.
Databases ensure that the data we put there stays there as long as they are needed. Often, these databases ensure that the data persisted is also backed up, to ensure there's no data loss.
Apart from the ability to store the data, the databases provide a way to query the data and access it.
Two very popular types of databases nowadays are relational and document databases.
Associated with databases is the so-called CRUD
acronym. It refers to C
reate, R
ead, U
pdate and D
elete, that are the actions we can perform over database content.
HTTP
Means for reliable exchange of text-based, human-readable content between different applications.
The core of HTTP
communication is the client/server and request/response pairs. The communication goes like:
- HTTP client application (say, your browser) sends a
HTTP Request
to a remote HTTP server - HTTP Server receives the requests, and responds with an
HTTP Response
which then client handles in some way.
An example of the HTTP
communication would be the one between a browser and a server needed to fetch resources for rendering
web page: html
documents, css
styles, JavaScrtipt
scripts, Ajax
requests, etc.
The most important part of HTTP
is the URL, which is a location of a remote resource we are communicating with, for instance
http://www.twitter.com
.
HTTP
supports the concept of verbs, such as GET
, POST
, DELETE
which provides a way to express more clearly what do try to do with
the remote resource, like getting some resource, deleting it, and so on.
Rest API
Rest API
is layer sitting on top of the HTTP
protocol, which is based on the concept of resources.
Rest API
, like any other API
, is used to build applications by integrating with them. In this case, integration is done using the HTTP
client.
Many popular platforms today, like Facebook, Twitter or Github provide their API
s, which they use themselves (e.g.
Facebook website or their Android and IOS mobile applications are consuming their API
), but also to be consumed by the others.
As mentioned, resources are a central concept in the Rest API
world. One resource can be (let's use Github API
as an example) a
repository, and Github API
states that we can manage particular repository via their API
s, by accessing particular URLs and
sending correct input data using proper HTTP
verbs.
Apart from Rest API
s, there are also emerging GraphQL API
s, which are also sitting on top of HTTP
protocols but differ in
philosophy of how the data should be accessed and managed.
IDE
Stands for an Integrated Development Environment, which is a piece of software used to develop applications. Some popular ones: Visual studio code, IntelliJ IDEA, etc.
IDEs differ from classical text editors by offering way richer set of refactorings, and way deeper awareness of the frameworks and programming languages.
Agile development
Agile development refers to the way the work is organized nowadays when developing software. This is more like ideology, and there are a couple of concrete methodologies present: Scrum, Kanban, etc.
The idea is that software should be changed frequently, in small increments, so that each increment can be validated by the user as soon as possible.
By being able to frequently release smaller software increments, we can get the feedback from the users, but also we are able to more easily recover from the bugs we may have introduced (since the scope of small changes allows for easier troubleshooting).
Testing
Testing is the formal process of asserting software quality. Ideally, we want to test the things ourselves, before we hand our software to our customers so that they can use and test it themselves.
In general, we have two types of tests:
- Manual
Manual testing involves 1 or more people testing applications by hand, by interacting with the mobile application, or browser-based websites, by going through a series of scenarios (use-cases) and asserting that what was expected happened. This testing category is very expensive, and error-prone on top (anything involving people is risky), so it should be minimized.
- Automated
This group of tests should be the desired way to test the software before releasing it. The idea is to write a set of tests that can be executed by machines, thus they will be much faster and less risky than when performed by people.
One test, implemented in some programming language (let's say Python), interacts with our software, gets the results of that interaction, and asserts that results match expectations of such an interaction.
These automated tests can further be divided on unit tests, integration tests (there's no consensus on what types of tests should exist and how should they be named at the time being)
Debugging
Debugging is most often used in those cases when we want to troubleshoot some issues we have with our application. Technically speaking, debugging is a mode in which we can start the application so that we can inspect its behavior during the execution. There is also a concept of breakpoints, which are the points in code (concrete lines in source files) we'd like to pause application execution.
How it works:
Set a breakpoint in the source file's concrete line you'd like to pause your application's execution
Trigger the application flow that would cause the line you set to be executed
When the execution reaches your line, it will pause there, giving you control to:
- resume it
- inspect all the variables (global, local)
- continue line by line
- step into functions...
This is mostly used by developers on their machines, but sometimes it's also possible to debug processes running on remote machines.
Compiling
Many programming languages (like Java, C#) require us to create an intermediary representation of code, which is both required and optimized for the runtime and is the representation that is required to run the applications written in these languages.
For instance, in case of Java programming language, we need to transform Java source code files into Java bytecode files. These are later used to run our application using Java runtime environment
There are also languages, such as JavaScript for instance, that does not require us to prepare any intermediary representation, but instead, source files are directly used by the application execution runtime (in case of JavaScript - a browser or NodeJS).
Client / Server
This is one interaction model (out of a few ones) between the two software parties (processes) where:
- client is the party which initiates communication with the server
- server is the party which responds to the requests sent by the client
This interaction model you can find, for instance, in web browsers, where the client (the browser itself) requests web resources from the web-server (which is server in this case)
Servers never initiate this interaction.
Deployment
Deployment is a process of installing your software on a server, making it available for end-users. Deployment frequency and duration can vary significantly between different applications, depending on software complexity, a technical debt of the applications, deployment tooling efficiency, etc.
Companies are deploying their software several times during the day, but some companies deploy very rarely, once in a month, or even more rarely.
The deployment should, ideally, be an automated process, consisting of various testing phases, packaging and versioning software release, installing it on a server, and performing some health checks to ensure the success of the deployment
Caching
Caching is a process of remembering values of some time-taking or resource expensive computations so that they can be reused for the subsequent, identical calls.
In some systems, querying a database is a very expensive operation time-wise, and we might want to cache some results that can be cached.
For instance, imagine we have an e-commerce platform, and we want to present categories to the users, which are coming from the database. These categories are not changing so frequently, so we might want to protect our database server being querying each time each page is displayed for each user by caching these results once they are returned from the database the first time.
We often store this cached value in server memory, or we can put this information to the external tools, such as Redis or similar, which have much better read performance than the databases.