Craig Risi

Oct 4 · 6 min read

Securing Data in your code

In my last post, I looked at some tips on how to write more secure code. However, code is not the only thing developers need to worry about: the data that their code stores, calls, and interacts with requires attention too.

Data is an important part of any application, so we can't ignore how best to protect it within our applications and coding habits. Protecting data is not just something we do at our public endpoints and database layers; it should also shape how we write code. No matter where your code sits in your architecture, how it transmits data and handles sensitive information is vital to protecting that data against threats at every level.

Below are some good tips to follow that will help ensure your code is able to protect all forms of data:

Encryption

Data at Rest: Ensure that sensitive data (like passwords, financial information, or personal identifiers) is encrypted when stored in databases, file systems, or other storage mediums.
Data in Transit: Use secure communication protocols (e.g., TLS, the successor to the now-deprecated SSL) to encrypt data during transmission between servers, APIs, and clients. This protects against interception and man-in-the-middle attacks.
End-to-end Encryption: For high-security applications, use end-to-end encryption (E2EE) where only the communicating users can decrypt the data, not even the service providers.

Input Validation and Sanitization

Sanitize Inputs: Prevent injection attacks (like SQL injection or cross-site scripting, XSS) by validating and sanitizing all user inputs. Use prepared statements or parameterized queries when interacting with databases.
Validation Libraries: Use well-maintained validation libraries to check data against expected formats, ensuring only valid inputs are processed by the application.
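
As a minimal sketch of the two points above, the snippet below validates a username against an expected format and then looks it up with a parameterized query. It uses Python's built-in sqlite3 module; the users table, the username rule, and the function names are assumptions made up for illustration.

```python
import re
import sqlite3

# Assumed format rule for this example: 3 to 30 letters, digits, or underscores.
USERNAME_PATTERN = re.compile(r"[A-Za-z0-9_]{3,30}")

def is_valid_username(username: str) -> bool:
    """Return True only if the whole string matches the expected format."""
    return USERNAME_PATTERN.fullmatch(username) is not None

def find_user(conn: sqlite3.Connection, username: str):
    """Look up a user with a placeholder instead of string formatting."""
    # The driver binds `username` as data, so quotes in it cannot change the SQL.
    cur = conn.execute(
        "SELECT id, username FROM users WHERE username = ?", (username,)
    )
    return cur.fetchall()

# Hypothetical in-memory table, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT)")
conn.executemany("INSERT INTO users (username) VALUES (?)", [("alice",), ("bob",)])

rows_ok = find_user(conn, "alice")
# A classic injection attempt is treated as a literal (non-matching) username.
rows_injected = find_user(conn, "alice' OR '1'='1")
```

Validation rejects obviously malformed input early, while the parameterized query is what actually stops the injection, so the two are best used together.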

Avoid Hardcoding Sensitive Information

Environment Variables: Store sensitive information like API keys, credentials, and encryption keys in environment variables or secret management services rather than hardcoding them in the codebase.
Secret Management Tools: Use secret management services (e.g., AWS Secrets Manager, HashiCorp Vault) to securely store and rotate sensitive data.

Hashing Passwords

Strong Hashing Algorithms: Store passwords using strong hashing algorithms (e.g., bcrypt, Argon2) that include salt to prevent attackers from using precomputed hash tables (rainbow tables) to crack passwords.
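Peppering: Add a secret value (a pepper) to passwords before hashing. Unlike a salt, the pepper is not stored alongside the hashes (it lives in application configuration or a secret manager), so it remains unknown to an attacker who compromises only the database.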
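
One possible way to combine a salt and a pepper, sketched with only the standard library: the password is first mixed with the pepper using HMAC, and the result is passed to scrypt with a per-user salt. The scrypt parameters and function names here are assumptions for illustration, and in a real system the pepper would be loaded from a secret manager rather than generated at runtime.

```python
import hashlib
import hmac
import os

def hash_password(password: str, pepper: bytes) -> tuple[bytes, bytes]:
    """Return (salt, digest) for storage; the pepper is never stored with them."""
    salt = os.urandom(16)  # unique per user, stored next to the digest
    # Mix in the secret pepper before the slow hash.
    peppered = hmac.new(pepper, password.encode(), hashlib.sha256).digest()
    # Assumed work factors; tune them for your own hardware.
    digest = hashlib.scrypt(peppered, salt=salt, n=2**14, r=8, p=1)
    return salt, digest

def verify_password(password: str, salt: bytes, digest: bytes, pepper: bytes) -> bool:
    """Recompute the hash with the stored salt and compare in constant time."""
    peppered = hmac.new(pepper, password.encode(), hashlib.sha256).digest()
    candidate = hashlib.scrypt(peppered, salt=salt, n=2**14, r=8, p=1)
    return hmac.compare_digest(candidate, digest)

# Stand-in for a pepper loaded from a secret manager (assumption for the demo).
pepper = os.urandom(32)
salt, digest = hash_password("userpassword123", pepper)
```

Without the correct pepper, the stored salt and digest alone are not enough to test password guesses.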

Code Audits and Static Analysis

Automated Security Scans: Use static code analysis tools (e.g., SonarQube, Veracode) to identify security vulnerabilities in the code, such as buffer overflows, insecure dependencies, or improper input validation.
Regular Code Reviews: Conduct thorough code reviews, especially for areas dealing with sensitive data or security-critical functionality, to catch potential vulnerabilities early.

Secure Error Handling

Limit Information Disclosure: Avoid exposing sensitive details in error messages, such as stack traces, database queries, or sensitive data. Ensure errors are logged securely but do not provide attackers with useful information.
Graceful Degradation: Ensure your system degrades gracefully in case of failures, especially for sensitive operations, so that data is not unintentionally exposed.
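
A rough sketch of limiting information disclosure: the full exception is written to the server-side log, while the caller only receives a generic message and a reference ID. The handle_request wrapper, the logger name, and the response shape are hypothetical and not tied to any particular web framework.

```python
import logging
import uuid

logger = logging.getLogger("payments")

def handle_request(operation):
    """Run `operation` and return a (status, body) pair that is safe to show a client."""
    try:
        return 200, {"result": operation()}
    except Exception:
        # Full details (including the stack trace) go to the server-side log only.
        error_id = uuid.uuid4().hex
        logger.exception("Unhandled error, reference %s", error_id)
        # The client sees a generic message plus a reference for support.
        return 500, {"error": "Internal server error", "reference": error_id}

def failing_operation():
    # Simulated failure whose message contains internal details.
    raise RuntimeError("connection to db01 failed for user 'admin'")

status, body = handle_request(failing_operation)
```

The reference ID lets support staff find the matching log entry without the response itself revealing hostnames, queries, or stack traces.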

Use Libraries and Frameworks Carefully

Avoid Insecure Libraries: Regularly audit and update third-party libraries or frameworks to ensure that known vulnerabilities are patched.
Dependency Management: Use tools like npm audit, Snyk, or OWASP Dependency-Check to monitor and secure dependencies in your project.

Logging and Monitoring

Log Only What’s Necessary: Ensure that sensitive information is not logged (e.g., passwords, encryption keys, personal data). Use logging frameworks that support redaction of sensitive data.
Monitor for Suspicious Activity: Implement real-time monitoring and anomaly detection tools to identify potential data breaches or abnormal behavior in your application.

Data Anonymization and Masking

Data Masking: When displaying sensitive data (e.g., credit card numbers), mask parts of the data (e.g., show only the last four digits) to reduce exposure.
Anonymization Techniques: For sensitive datasets, use anonymization techniques to remove or obscure personal identifiers, ensuring compliance with privacy regulations (e.g., GDPR).
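
As one illustration of obscuring personal identifiers, the sketch below replaces an email address with a keyed hash (HMAC-SHA256) so records can still be joined on the token without exposing the address. Strictly speaking this is pseudonymization rather than full anonymization, since anyone holding the key can recompute the tokens. The function name and sample record are assumptions.

```python
import hashlib
import hmac

def pseudonymize(value: str, secret_key: bytes) -> str:
    """Replace an identifier with a stable token derived from a secret key."""
    # Lowercasing first so "Alice@Example.com" and "alice@example.com" match.
    normalized = value.lower().encode()
    return hmac.new(secret_key, normalized, hashlib.sha256).hexdigest()

# Hypothetical key; in practice it would come from a secret manager.
key = b"example-only-key-do-not-use-in-production"

record = {"email": "Alice@Example.com", "plan": "premium"}
safe_record = {
    "user_token": pseudonymize(record["email"], key),
    "plan": record["plan"],
}
```

A plain unkeyed hash would be weaker here, because email addresses are guessable and an attacker could simply hash candidate addresses and compare.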

Regular Security Patching and Updates

Patch Vulnerabilities Promptly: Ensure that all software components (including third-party libraries) are kept up to date with the latest security patches. Vulnerabilities in libraries can be a major attack vector.
Automated Security Patching: Use automated tools to track and deploy security patches for both code dependencies and server infrastructure.

Some Basic Examples of Secure Code Practices with Data

Much like last week's post, I want to provide some basic examples that showcase small, simple changes you can make in your code to help protect data. There are far more complex scenarios that many organizations will need to look at, but I will perhaps leave those for a bigger post due to the complexity of that code.

Still, even just following some of the below basic code examples (all shown in Python code) can make a big difference in ensuring the code is secure.

Securing data in code involves using encryption, hashing, access control, and secure communication methods to ensure data integrity, confidentiality, and protection from unauthorized access.

Below are a few practical examples:

Encrypting Sensitive Data

Insecure Code:

# Insecure: Storing sensitive data in plain text
credit_card_number = "4111111111111111"
store_in_db(credit_card_number)

Problem: Storing sensitive data like credit card numbers in plain text makes them vulnerable to unauthorized access if the database is compromised.

Secure Code:

# Secure: Encrypting sensitive data before storing it
from cryptography.fernet import Fernet
 
# Generate a key and instantiate the cipher
key = Fernet.generate_key()
cipher = Fernet(key)
 
encrypted_data = cipher.encrypt(credit_card_number.encode())
 
# Store encrypted data and key securely
store_in_db(encrypted_data)
store_key_securely(key)

Solution: Encrypt sensitive data using symmetric encryption from a secure library (e.g., Fernet in Python's cryptography package, which uses AES and also authenticates the ciphertext). Make sure the encryption key is stored securely and separately from the data, such as in a key management system (KMS) or environment variables.

Hashing Passwords

Insecure Code:

# Insecure: Storing plain text passwords
password = "userpassword123"
store_password_in_db(password)

Problem: Storing passwords in plain text leaves them exposed if the database is compromised.

Secure Code:

# Secure: Hashing the password before storing it
import bcrypt
 
hashed_password = bcrypt.hashpw(password.encode(), bcrypt.gensalt())
 
store_password_in_db(hashed_password)

Solution: Use a strong hashing algorithm (e.g., bcrypt, Argon2, or PBKDF2) to hash passwords before storing them. Avoid using weak or fast hashing algorithms like MD5 or SHA-1.
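
Storing the hash is only half of the picture, since the password also has to be checked at login. Below is a minimal sketch of that round trip using PBKDF2 from the standard library instead of bcrypt, so it runs without extra packages. The iteration count and function names are assumptions for illustration.

```python
import hashlib
import hmac
import os

ITERATIONS = 200_000  # assumed work factor; tune for your hardware

def hash_password(password: str) -> tuple[bytes, bytes]:
    """Return (salt, digest); both are stored with the user record."""
    salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, ITERATIONS)
    return salt, digest

def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Re-derive the digest from the login attempt and compare in constant time."""
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, ITERATIONS)
    return hmac.compare_digest(candidate, expected)

salt, stored_digest = hash_password("userpassword123")
login_ok = verify_password("userpassword123", salt, stored_digest)
login_bad = verify_password("not-the-password", salt, stored_digest)
```

With bcrypt, the equivalent check is bcrypt.checkpw(password.encode(), hashed_password), which reads the salt back out of the stored hash.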

Encrypting Data in Transit (Using HTTPS)

Insecure Code:

# Insecure: Sending data over HTTP
import requests
 
data = {"username": "user", "password": "password"}
response = requests.post("http://example.com/api/login", data=data)

Problem: Sending sensitive data over HTTP transmits it in plain text, making it vulnerable to interception via man-in-the-middle (MITM) attacks.

Secure Code:

# Secure: Sending data over HTTPS
import requests
 
data = {"username": "user", "password": "password"}
response = requests.post("https://example.com/api/login", data=data, verify=True)

Solution: Always use HTTPS for communication over the internet. HTTPS encrypts data in transit using TLS, protecting it from interception. Note that requests verifies certificates by default (verify=True); the important thing is never to disable it with verify=False.
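
If you are working with the standard library rather than requests, a similar effect can be sketched with an ssl context. The example below assumes you want to refuse anything older than TLS 1.2; the function name is hypothetical.

```python
import ssl

def make_client_context() -> ssl.SSLContext:
    """Build a client-side TLS context with certificate checks enabled."""
    # create_default_context() turns on certificate and hostname verification.
    ctx = ssl.create_default_context()
    # Assumed policy for this sketch: reject protocol versions older than TLS 1.2.
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    return ctx

ctx = make_client_context()
```

A context like this can be passed to urllib.request.urlopen(url, context=ctx) or used to wrap a socket.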

Masking Data in Logs

Insecure Code:

# Insecure: Logging sensitive information
import logging

credit_card_number = "4111111111111111"
logging.info(f"Processing payment for card number: {credit_card_number}")

Problem: Sensitive data such as credit card numbers should never be logged in plain text, as logs can be accessed by unauthorized parties.

Secure Code:

# Secure: Masking sensitive data in logs
masked_number = credit_card_number[-4:].rjust(len(credit_card_number), '*')
logging.info(f"Processing payment for card number: {masked_number}")

Solution: Mask sensitive data before logging it. For example, display only the last 4 digits of credit card numbers and replace the rest with *.
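
Masking at each call site is easy to forget, so one option is to redact centrally with a logging filter. The sketch below assumes card numbers appear as unbroken runs of 13 to 19 digits; the pattern, class name, and logger name are illustrative only, and a real system would likely need more careful matching.

```python
import logging
import re

# Assumption: a card number is an unbroken run of 13 to 19 digits.
CARD_PATTERN = re.compile(r"\b\d{9,15}(\d{4})\b")

def _mask(match: re.Match) -> str:
    # Keep the last four digits and replace the rest with asterisks.
    return "*" * (len(match.group(0)) - 4) + match.group(1)

def mask_card_numbers(text: str) -> str:
    return CARD_PATTERN.sub(_mask, text)

class CardRedactionFilter(logging.Filter):
    """Rewrite each log record so card-like numbers are masked."""

    def filter(self, record: logging.LogRecord) -> bool:
        # Format the message first, then mask it and drop the raw arguments.
        record.msg = mask_card_numbers(record.getMessage())
        record.args = ()
        return True  # keep the record, now redacted

logger = logging.getLogger("payments")
logger.addFilter(CardRedactionFilter())
```

A filter added to a logger only sees records logged directly on that logger; attaching it to a handler instead covers every record that reaches the handler.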

Using Environment Variables for Sensitive Information

Insecure Code:

# Insecure: Hardcoding API keys or database credentials in code
API_KEY = "myapikey123456789"
DB_PASSWORD = "mypassword123"

Problem: Hardcoding sensitive credentials in the source code increases the risk of them being exposed, especially if the code is shared or stored in a version control system like Git.

Secure Code:

# Secure: Storing sensitive information in environment variables
import os
 
API_KEY = os.getenv('API_KEY')
DB_PASSWORD = os.getenv('DB_PASSWORD')

Solution: Store sensitive information like API keys, passwords, and tokens in environment variables rather than hardcoding them. Tools like dotenv can load these values from a local .env file during development; keep that file out of version control.
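
One thing to watch for is that os.getenv returns None when a variable is missing, which can surface much later as a confusing failure. A small helper that fails fast at startup is sketched below; the helper name is an assumption.

```python
import os

def require_env(name: str) -> str:
    """Return the value of an environment variable or fail with a clear error."""
    value = os.environ.get(name)
    if not value:
        # Report only the variable name, never a secret value.
        raise RuntimeError(f"Missing required environment variable: {name}")
    return value

# Typical use at application startup:
# API_KEY = require_env("API_KEY")
# DB_PASSWORD = require_env("DB_PASSWORD")
```

Failing at startup makes a misconfigured deployment obvious immediately instead of at the first API call.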

Secure Data Access Control

Insecure Code:

# Insecure: Accessing all user data without restrictions
def get_user_data(user_id):
    return db.query("SELECT * FROM users")

Problem: This code retrieves all user data without filtering based on user permissions, which can lead to data exposure for unauthorized users.

Secure Code:

# Secure: Limiting data access based on user roles and permissions
# Note: requester_role and user_id must come from the authenticated session,
# never from values the client can set freely.
def get_user_data(user_id, requester_role):
    if requester_role == "admin":
        return db.query("SELECT * FROM users")
    else:
        return db.query("SELECT * FROM users WHERE id = %s", (user_id,))

Solution: Implement role-based access control (RBAC) to restrict data access to authorized users. Make sure each user can access only the data they are authorized to see, based on their role or permissions.
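
A slightly fuller sketch of the same idea makes the ownership check explicit: a non-admin may only read their own record, and anything else raises an error. It uses an in-memory sqlite3 database; the table layout, parameter names, and sample rows are assumptions for illustration.

```python
import sqlite3

def get_user_data(conn, requester_id: int, requester_role: str, target_user_id: int):
    """Return one user's row if the requester is allowed to see it."""
    # Non-admins may only read their own record.
    if requester_role != "admin" and requester_id != target_user_id:
        raise PermissionError("Not allowed to read another user's data")
    # Select only the columns the caller needs rather than SELECT *.
    cur = conn.execute(
        "SELECT id, email FROM users WHERE id = ?", (target_user_id,)
    )
    return cur.fetchone()

# Hypothetical data for the demo.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, password_hash TEXT)")
conn.executemany(
    "INSERT INTO users (id, email, password_hash) VALUES (?, ?, ?)",
    [(1, "alice@example.com", "x"), (2, "bob@example.com", "y")],
)

own_row = get_user_data(conn, requester_id=1, requester_role="user", target_user_id=1)
```

Here requester_id and requester_role are assumed to come from the authenticated session, so a user cannot grant themselves access by changing a request parameter.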

Encrypting Files Before Storing

Insecure Code:

# Insecure: Storing files in plain text
with open('user_data.txt', 'w') as f:
    f.write("Sensitive user data")

Problem: Storing files without encryption can lead to data exposure if the filesystem is compromised.

Secure Code:

# Secure: Encrypting files before storing them
from cryptography.fernet import Fernet
 
key = Fernet.generate_key()
cipher = Fernet(key)
 
sensitive_data = "Sensitive user data"
encrypted_data = cipher.encrypt(sensitive_data.encode())
 
with open('user_data.enc', 'wb') as f:
    f.write(encrypted_data)
 
# Store encryption key securely, not in the same file

Solution: Use encryption to secure sensitive files before storing them. Ensure the encryption key is stored securely and separately from the encrypted data.

Conclusion

Your code and data are essentially inseparable when it comes to software security. When we write code, we need to follow best practices in how we manage its data interactions so that we don't compromise data in any way. Software delivery teams need to put considerable effort into following these practices and secure coding standards throughout every aspect of the code.
