Pegasus Security Implementation Guidelines
Problem Statement
Open Pegasus has a challenging role. It provides a portal
for users and programs to access a wide variety of information on a system.
Pegasus is responsible for user authentication and provides a framework for
the provider authors to authorize specific read and write operations on the
server. Managing the resulting "trust delta" (the difference between
what the provider could do in its current execution context, vs. what
a given user is authorized to do) is hard. The bigger the trust delta,
the greater the incentive to "break in" past authorizations in providers
to "get to" a super-user/administrator execution context (or to the
context of a user that can do something the authenticated user isn't authorized
to do). Though the OpenPegasus 2.5 feature, "run-as-requestor,"
does lower the risk for a provider that takes advantage
of the run-as-requestor context, risks remain for providers that
decide to run at elevated privilege (defined as when the execution context has
more permissions/abilities than the authorized user, which is hard to avoid when
execution is not as the authorized user).
Requirements, Constraints or Assumptions
Most of the risk of failing to follow the guidelines below is present only
when code is not run as the authenticated user (whether that code is in the
provider or the client) or in deployments in which authentication
isn't used (e.g., SNMP-public info). Even in more limited deployments, many
of the problems below can still cause crashes or denial-of-service conditions.
Definitions:
Elevated code: code for which there is a difference between the actions that
the logged-in user is authorized to perform and what the execution context of the
running program permits. This "trust delta" must then be managed by the code to
ensure that the user doesn't perform more actions than they are authorized to,
either directly or through side effects. For example, a process run under the
UID of the authenticated user is said to be "non-elevated." A process
running as administrator on behalf of a "non-administrator" user is
called elevated.
Privilege: The collection of actions
a process or user is not prevented from doing. An "administrator"/root
user is said to be full privilege with respect to a system since that execution
context does not prevent any action on the system.
Trust: The degree to which an actor
that is interacting with the component under consideration is believed to
behave non-maliciously. For example, an arbitrary user on the Internet has
no "trust"; a junior operator or administrator is trusted not to attempt
malicious activity, but may accidentally attempt damaging actions.
Root/Administrator code is trusted to behave correctly.
Security Testing: Non-functional
testing that centers around behavior in the presence of malicious use.
Examples include testing for crashes or security side-effects in the presence
of overly long inputs, special character inputs, high-system load, and network
storm environments.
Security Side Effect: Applications
often, in accomplishing their goals, perform actions beyond those visible to the
user. Examples include writing temporary files, or clearing or requesting
memory. Since this behavior is not specified in the functional
requirements, it is often not tested. This "side effect" behavior is often
the behavior that a malicious user attempts to leverage when trying to gain
privilege. Examples include exploiting race conditions where a temporary
file is momentarily world writeable, before it is chmod-ed. This window is
an opportunity for a malicious user to insert data that can change the behavior
of the application.
References:
Architecture:
http://www.opengroup.org/security/secarch.htm
Books and References (not endorsed by
Opengroup or partners):
Proposed Solution
General Implementation Guidelines:
Code in elevated contexts (including providers not running as-requestor) that
doesn't adhere to the following guidelines should be considered a bug. It is a best practice
to follow these guidelines for all code.
- Avoid buffer overflow vulnerabilities in your code (attackers use
them to insert arbitrary code)
Buffer overflows in network-accessible software cause a large share, and some
believe the majority, of software vulnerabilities. Since Pegasus is written in C/C++,
it is especially susceptible to buffer overflows. Strongly consider using
a static analysis tool like Flawfinder or RATS to look for common problems. Dynamic
tools can also be used, but they identify a problem only when the overflow actually
happens rather than finding potential overflows. The susceptibility stems from a lack
of bounds checks in C and C++. Problematic functions include strcpy, sprintf,
strcat, and gets; even length-limited variants such as strlcat are unsafe when
called with the wrong buffer size.
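As a minimal sketch of a bounded alternative to strcpy, the helper below copies through snprintf and reports truncation to the caller. The name copyBounded is hypothetical and is not part of Pegasus.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>

// Hypothetical helper: copies src into dst without ever writing more than
// dstSize bytes. Returns true if the whole string fit, false if truncated.
bool copyBounded(char* dst, std::size_t dstSize, const char* src)
{
    assert(dst != nullptr && src != nullptr && dstSize > 0);
    // snprintf writes at most dstSize bytes, including the terminating NUL,
    // and returns the length the full result would have had.
    int needed = std::snprintf(dst, dstSize, "%s", src);
    return needed >= 0 && static_cast<std::size_t>(needed) < dstSize;
}
```

Returning false on truncation lets the caller reject oversized input instead of silently continuing with a shortened value.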
- Avoid format string vulnerabilities in your code (attackers use them to
read data or insert arbitrary code)
Format strings define the format and types of program variables that are substituted
into an input or output string. Exploitation of format strings occurs when
functions that require a format string are coded with a variable, and that
variable is not validated. For example the following is vulnerable code: printf(string_from_untrusted_user)
as the user can supply the format string, and read or overwrite (using %n)
arbitrary data. Each developer must:
- Use functions with a strict format string argument.
- Check the input parameter data for format strings before assigning them
to variables.
- Always validate that the input parameter data does not contain any program-specific
format characters before assigning the input data to variables.
- Always check the return codes of library functions for failure.
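The first rule above can be sketched as follows: the format string is a constant, and the untrusted text is passed only as an argument. The function name formatUntrusted and the "request: " prefix are illustrative assumptions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>

// Hypothetical helper: formats untrusted text into out using a constant
// format string. Returns the length the full result would have had.
int formatUntrusted(char* out, std::size_t outSize, const char* untrusted)
{
    assert(out != nullptr && untrusted != nullptr && outSize > 0);
    // Unsafe would be: std::snprintf(out, outSize, untrusted);
    // Here any '%' characters in untrusted are copied as plain data.
    return std::snprintf(out, outSize, "request: %s", untrusted);
}
```

With this shape, input such as "%x%n" appears literally in the output rather than being interpreted as conversion specifications.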
- Adhere to the general, good programming practices:
- Always have people other than the coder review the code.
- Always have people other than the coder develop tests and test the
end product.
- Check return codes from system or library calls, and handle errors
or exceptions gracefully.
- Keep your code simple: simple code decreases the risk of defects; complex
code increases the risk of defects.
- Don’t use uninitialized variables.
- Use symbolic constants (such as #define) to minimize typos and improve
code maintainability.
- Use temporary files with care. Do not create temporary directories
or files from your program that are world writeable. Limit the permissions
to what is needed by the program. Clean up temporary files or directories
when you are finished using them.
- Do not put sensitive information in log files. For example, do not print
social security numbers, passwords, credit card numbers, or any other
sensitive or personal information for debugging purposes in the log files.
Sometimes such information shows up in the web browser in case of exceptions
or application failure.
- Enforce strong password policies and a delay on failed logins. This
helps to prevent unauthorized access to private data. See libpam for a
good way to implement this.
- Validate input to the program or system before processing the input.
Test to see if the input is the proper type of data and in the range
of acceptable or expected values, and test both upper and lower bounds.
For example, if you are reading in a year value (int), and you have
already checked for buffer overflow and format strings:
Correct: if 0 <= year and year <= 3000 then (accept input and process)
Unsafe: if year <= 3000 then (accept input and process)
The vulnerability of the unsafe example is that if someone were to
supply a value of -32769, then they could intentionally stop or corrupt
a procedure. If the language that you are using does not enforce
strong types, then a type check should also be performed before accepting
the input.
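The year check can be sketched in C++ as below, assuming the value arrives as text. The function name parseYear is hypothetical; the bounds 0 and 3000 come from the example above.

```cpp
#include <cassert>
#include <cerrno>
#include <cstdlib>

// Hypothetical helper: parses text as a decimal year and accepts it only
// if it lies in [0, 3000]. On success stores the value in *year.
bool parseYear(const char* text, int* year)
{
    assert(text != nullptr && year != nullptr);
    errno = 0;
    char* end = nullptr;
    long value = std::strtol(text, &end, 10);
    if (end == text || *end != '\0') return false;  // no digits, or trailing junk
    if (errno == ERANGE) return false;              // does not fit in a long
    if (value < 0 || value > 3000) return false;    // lower and upper bound
    *year = static_cast<int>(value);
    return true;
}
```

Checking the end pointer covers the type check, and the final comparison tests both bounds, so a value such as -32769 is rejected.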
- Use the principles of least and necessary privilege.
Only grant the minimum set of privileges required to perform an operation,
and grant these privileges for the minimum required amount of time. For example,
if a provider needs to modify both mail queues and print spools, don’t run
the application as root; instead, use /etc/logingroup or other facilities
to give the application the privileges which it needs, but not more privileges
than it needs.
- Use SSL securely:
Please refer to the OpenSSL documentation for usage. Pegasus libraries
help with some but not all functions necessary for certificate management
and usage.
- Handle race conditions securely
Race conditions occur when two or more processes access a shared resource
in an order that was not expected by the program. Unordered access to resources
is common in multitasking environments, and is mostly associated with either
Signal Handlers or File Handling. For example, if a program "A"
checks to see if a file exists before writing, but a program "B"
creates a link after the check, but before the write, "A" may inadvertently
overwrite the link destination with the permissions associated with "A".
This can be a security problem if "A" has different permissions
than "B."
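For the file-handling case, one common way to close the window between the check and the write is to let the operating system do both in a single step. The sketch below assumes a POSIX system; the function names are hypothetical.

```cpp
#include <cassert>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

// Hypothetical helper: creates path for writing only if nothing exists there.
// With O_CREAT | O_EXCL the existence check and the creation are one atomic
// step, and open fails if path already exists, even as a symbolic link.
// Returns an open file descriptor, or -1 with errno set.
int createExclusive(const char* path)
{
    assert(path != nullptr);
    return open(path, O_WRONLY | O_CREAT | O_EXCL, S_IRUSR | S_IWUSR);
}

// Hypothetical helper: creates a uniquely named temporary file from a
// template ending in "XXXXXX". mkstemp fills in the name and never reuses
// an existing file. Returns an open file descriptor, or -1 on failure.
int createPrivateTemp(char* pathTemplate)
{
    assert(pathTemplate != nullptr);
    return mkstemp(pathTemplate);
}
```

Both helpers return a descriptor, so later writes go to the file that was actually created rather than to whatever the name refers to at a later time.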
- Use secure defaults when possible, or clearly document when they aren't
used
- Design Securely
Design your code so that as little as possible runs as a privileged user.
All privileged user code (especially if it listens on a network or executes
on behalf of other users) should be inspected very thoroughly, so it should
be short and simple. Each module of code should have a clean interface for
other modules to use and a well-defined perimeter around each module.
Additional details can be found at:
Architectural Patterns for Enabling Application Security (http://www.joeyoder.com/papers/patterns/Security/appsec.pdf)
- Test for security (use both positive and negative tests)
“Positive tests” verify that the functionality of the product works as specified.
“Negative tests” attempt to subvert the security of the system, and are often
overlooked when testing software. Spend some time thinking like a hacker and
trying to break your system. Always test boundary conditions or corner cases
for values of data, size of data, and type of data. Many common bugs are related
to this. Sometimes this type of bug may result in wrong information being
retrieved from the database instead of failing gracefully. Attempt to exploit
the system with buffer overflow and format string attacks.
- Don’t bundle private copies of security code
Security code (especially highly scrutinized open-source code) is likely to
have security bulletins issued against it. When such security bulletins are
inevitably issued against code you depend on, you don’t want to have to issue
a bulletin against your product also. If your code depends
on a standard distribution of a component that you need (for example OpenSSL),
rather than embedding a private copy, then whenever a security bulletin is
issued against it, you won’t have to reissue the bulletin after repackaging
the fix for your private copy.
General Coding Best-Practices:
- Avoid implementing security functionality: Making security claims
in your documentation (beyond the implied security claims of authentication
and authorization done by the operating system) can increase your risk of
having a security defect. This is because any claim that is not
fully implemented or enforced is by definition a security defect and requires
an expedited fix and a security bulletin to announce that fix. Reuse of tried-and-tested
code that has been used in a security context is always a better choice.
Never implement a random number generator or cryptographic algorithm unless
you're a cryptographer by profession. You will almost certainly get it wrong.
You should always, however, document your security behavior.
- Duplicating authorization code: Related to risk #1, every provider
has the risk of authorization related defects because the authorization done
in each provider is a duplication of the kernel authorization code. However,
you can still decrease your risk by using common APIs. For example, many
providers will need a way to tell if the authenticated user should have access
to a given file. The code which does this needs to check the user ID and group
IDs of the file in question and all of its parent directories. Any defect
in this code could easily be a security defect and would need to be fixed
in every copy of that code. For this reason it is imperative that this logic
exists in only one place and that your provider uses that copy. Do not try
to replicate this complex logic in your own provider, unless you are the single
owner of that code.
- WBEM provider/client combinations: Writing a WBEM provider that is also
a WBEM client (makes requests of other providers) has security risks/challenges.
There are two subcategories of this risk/challenge:
- The connectLocal() API uses the UID of the running process to
do authentication. Thus, the provider initiating the request must ensure
that the user is authorized for the other provider’s data before making the request.
(This is another example of Risk 2, multiple copies of authorization code.)
One feasible way to do this is to check that the user is a privileged
user before calling the other provider (in which case the UID matches
the running process).
- Using the connect() API has additional complexities. Credentials must
somehow be passed into the provider and then handled appropriately. Also,
there are additional client responsibilities for certificate validation
and testing, and the consequences are more severe because the client is
running with elevated privileges.
Provider Implementation Guidelines
Code that doesn't follow the
following guidelines in providers running at elevated privilege should be
considered a bug.
Code in providers running as-requestor can consider the following as general
best practice.
- Check the username/uid and execute every method as if it were running
as that user (i.e., as if the OS kernel or
authorization service had done the authorization).
Providers do this by checking each operation they perform and ensuring that those
operations, when performed on behalf of a non-privileged user, do not have security side effects.
Any discrepancy between the authorizations done by the OS kernel and those done
by the provider that is not part of documented behavior is a security defect.
If the user does not have the privileges to perform the requested operation,
the Provider must throw CIMAccessDeniedException.
- Keep your design/provider simple.
While this is difficult to quantify, it is important to minimize the amount
of code running as a privileged user. As a general guideline, if a significant number of lines
of code run with elevated privilege, the likelihood of a security
defect is high. Remember that any defect in elevated-privilege code is a potential
security defect, so all of this code must be straightforward and easy to review
based on the principles mentioned in the General Coding Best-Practices above.
The likelihood of a defect not being found is proportional to the amount and
complexity of the code.
- Provider must not use any calls such as setuid, or set environment
variables (e.g., PATH), that would alter the state of the process running the
CIM Server.
This could cause unexpected results for other providers or threads.
- Provider must document property authorizations.
Specifically, the provider should describe which data elements it makes available
for reading, which system changes it is capable of making, and which users
will be able to read those elements and make those changes.
- Provider must check all untrusted input for validity.
While the CIM Server ensures that the input is a valid CIM request, the provider
is responsible for validating that the CIM request does not cause any side
effects by ensuring that the input strings contain only expected characters
and that values are within an expected range. Examples of input data that
must be checked include directory or file names, data within files that are
read by the provider, and data returned from system calls.
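A minimal sketch of such a check for a file name is shown below. It accepts only names built from an explicit allow-list. The function name isSafeFileName, the 64-character limit, and the character set are illustrative assumptions, not Pegasus requirements.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper: accepts only short names made of expected characters.
// Anything else, including path separators and "..", is rejected.
bool isSafeFileName(const char* name)
{
    assert(name != nullptr);
    const std::size_t maxLength = 64;  // assumed limit for illustration
    const char* allowed =
        "abcdefghijklmnopqrstuvwxyz"
        "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
        "0123456789_.-";
    std::size_t length = std::strlen(name);
    if (length == 0 || length > maxLength) return false;
    // strspn returns the length of the leading run of allowed characters;
    // it equals the full length only if every character is allowed.
    if (std::strspn(name, allowed) != length) return false;
    // Reject ".", "..", and any other name that starts with a dot.
    if (name[0] == '.') return false;
    return true;
}
```

The check starts from "reject everything" and lists what is permitted, so unexpected characters fail without having to be anticipated individually.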
- Provider must execute stress tests.
These include operation in the presence of multiple interacting provider requests.
Based on a white box analysis of your provider, identify ways in which testing
could stress your provider. For example, sending large input strings, a large
number of simultaneous requests, requests including out-of-bounds data, or
ensuring that every branch is covered are just a few ways that you could stress
your product to find potential defects. By exploring the way your provider
fails, you can look for side effects that might lead to "infinite"
resource requests, overwritten data, or other anomalies that could cause a
denial-of-service or reveal a side-effect that can be leveraged as an exploit.
- Design your provider to expect belligerent input.
For example, have a common method that validates all CIM requests and ensure
that that method gets called for every request. The method should assume that
input is invalid unless it matches a specific format and specific bounds are
checked. Also, if your provider allocates any memory buffers or writes to
any file based on user input, all error conditions (out-of-memory, disk full,
file is a symbolic link/device file/directory instead of the expected format,
buffer/array too small for data, etc.) should be checked and all of this should
be enforced in a common place.
- Do not allow group or world-write access to your shared library,
any other executable code, configuration files, or any parent directory of
any of the above.
Although only a privileged user ought to be able to create the symbolic links
or shortcuts to the provider shared library in the designated WBEM provider
library directory, the actual provider shared library can be placed in any
directory. A provider must ensure that its shared libraries are protected in
such a way that only a privileged user can modify or delete the shared library
or the directory where the shared library is located.
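An installation or self-test step could verify these permissions along the following lines. This is a sketch for POSIX systems; the function names are hypothetical, and the same check would need to be repeated for each parent directory.

```cpp
#include <cassert>
#include <sys/stat.h>
#include <sys/types.h>

// Hypothetical helper: true if the mode bits grant write access to the
// group or to others.
bool modeAllowsGroupOrWorldWrite(mode_t mode)
{
    return (mode & (S_IWGRP | S_IWOTH)) != 0;
}

// Hypothetical helper: true if path is writable by group or others, or if
// it cannot be examined at all (treated as unsafe). stat follows symbolic
// links, so the permissions checked are those of the actual library file.
bool isGroupOrWorldWritable(const char* path)
{
    assert(path != nullptr);
    struct stat info;
    if (stat(path, &info) != 0) return true;
    return modeAllowsGroupOrWorldWrite(info.st_mode);
}
```

Treating an unreadable path as unsafe keeps the check conservative when the library or one of its directories is missing.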
Provider Best-Practices:
- Use the "UserContext registration" setting, present in OpenPegasus
2.5 and later:
In Pegasus 2.5 and later, you should strongly consider registering your provider
to run in the requestor's context or, if that is
not available, using Windows "impersonation" or forking a process that runs
as the correct user. For providers in versions prior to 2.5, you may
want to consider implementing your own out-of-process provider to avoid the
risks of running at elevated privilege. For those that must run privileged:
- Check that the authenticated username provided by CIM matches the
effective user id of the running process.
For Pegasus 2.4 and prior, this means that only the privileged user would be
able to use your provider.
The general property is that if you are not elevating privilege (running
on behalf of a different user), then the likelihood of a security defect
is greatly decreased. Making your code more general may mean less work in
the future when non-privilege-elevated providers are able to run with the correct user-id.
Even in the model where Pegasus is
run under a non-privileged user, there is a delta in "trust" between the
different users. This still represents some risk, though not as much
as deploying a run-as-administrator Pegasus. There is an opportunity
to improve Pegasus to better support fully protecting this use model, though
this is less urgent than addressing the higher risk associated with a
CIM Server running as administrator.
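The username check described above might look like the following on a POSIX system. How the provider obtains the authenticated user name from the CIM request is outside this sketch; requestorMatchesProcess is a hypothetical name.

```cpp
#include <cassert>
#include <cstring>
#include <pwd.h>
#include <sys/types.h>
#include <unistd.h>

// Hypothetical helper: true only if userName is the account name that
// belongs to the effective user ID of this process.
// Note: getpwuid is not reentrant; a multi-threaded server would use
// getpwuid_r instead.
bool requestorMatchesProcess(const char* userName)
{
    assert(userName != nullptr);
    const struct passwd* entry = getpwuid(geteuid());
    // If the account cannot be looked up, refuse rather than guess.
    if (entry == nullptr || entry->pw_name == nullptr) return false;
    return std::strcmp(entry->pw_name, userName) == 0;
}
```

A provider that gets false from such a check would refuse the request instead of acting on behalf of a different user.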
- Recommend configuring the WBEM users group
(ref: PEP 142): For Pegasus versions
prior to 2.5 and for subsequent versions, customers can configure a specific group of users who
have access to WBEM providers. This allows customers to choose a tradeoff
between security risk and ease-of-setup. Since every provider runs with
elevated privilege, the risk of security defects is high. Thus, it is advised
that customers configure this group of WBEM users to only allow access to
users who are trusted not to be malicious. If you also do not run by default,
this information can be in your initial setup documentation so that it gets
to all of your customers. This can greatly decrease your risk of having
a security defect, because malicious activity can potentially be ruled
out.
- Providers should consider the tradeoff between default installation/registration
and optional installation: An optional installation of a component (as part of
an OS or software package) gives customers a choice as to whether or not to
limit their interface/exposure, and maintenance/patch burden. Your provider
likely meets a real need for many customers, but there are also customers
who do not need the functionality you provide. There are many customers who
would prefer less patching/update cost and decreased security risk (risk is
added whenever there is a new interface) versus the functionality that your
product provides. Although technically this doesn’t decrease the risk of having
a security defect, it can give you more options for interim workarounds until
you can get a critical fix out, and fewer customers would be affected by any
given defect. Provider writers and bundlers should consider these benefits
and weigh those against the bundling benefits of mandatory inclusion.
- Log important events, such as unauthorized requests: This can help
a customer track down a potential intrusion as well as debug problems. Do
not include confidential information, such as passwords, in the log. Ensure
that the confidentiality of information stored in the log is commensurate
with access to the log. It is recommended that you use a common logging facility,
such as syslog. With syslog, concerns such as log rotation are typically already
handled by the system, and the administrator already knows where to look for your logs.
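A denied request could be recorded through syslog roughly as follows. The message layout and both function names are assumptions made for illustration; only the user name and operation are logged, never credentials.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <syslog.h>

// Hypothetical helper: builds an audit line naming the user and operation.
// Returns the length the full line would have had.
int formatDeniedRequest(char* out, std::size_t outSize,
                        const char* user, const char* operation)
{
    assert(out != nullptr && user != nullptr && operation != nullptr);
    return std::snprintf(out, outSize,
                         "access denied: user=%s operation=%s",
                         user, operation);
}

// Hypothetical helper: sends the audit line to the system log.
void logDeniedRequest(const char* user, const char* operation)
{
    char line[256];
    formatDeniedRequest(line, sizeof(line), user, operation);
    // Constant format string; the assembled line is passed as data.
    syslog(LOG_AUTH | LOG_WARNING, "%s", line);
}
```

Separating the formatting from the syslog call keeps the message easy to test and makes it simple to review exactly what reaches the log.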
- When making system changes, use
platform security checks where possible vs. rewriting your own authorization
code: Duplicating
authorization code at least doubles the work and is more error-prone.
Client Implementation Guidelines:
Note: In general, these are the
responsibility of the applications invoking CIM client libraries to the extent
that the client libraries don't yet provide the direct support.
Client code that doesn't follow
these guidelines should be considered a bug:
- Use SSL as follows in your remote production client. Though WBEM does
provide libraries to help, client behavior is the client's responsibility:
- Protect the Keystore and Truststore for remote production clients:
- Use proper file and
directory permissions to protect keystore and
truststore files.
- If your applications are importing a server’s certificate to a
truststore, you must ensure that the user validates the certificates
received before adding them to a truststore or keystore.
- Do not use a key size of less than 1024 bits to create keystores.
- Keystores/truststores should not be readable or writeable by anyone
other than the user who owns them.
- General programming standards
- Do not use world-writable files or directories (including /tmp and /var/tmp).
Make sure all credentials (passwords/certificates) are readable only by
their owner.
- Do not cache passwords unless directed to do so by the user. The
user should be aware that their password is being stored permanently on
the client machine.
- Do not pass passwords as an option on the command line in
non-Windows clients.
Command lines are visible to all users on the system in
some operating systems.
General client best-practices:
- Limit access to client data: Each user of a WBEM client should have
his/her own WBEM client instance. The WBEM client process should run
as the correct user on the client machine.
- Local vs. Remote Requests and Username/Password Authentication: Use
the connectLocal() API call to connect to the CIM server whenever possible.
To use this API call properly, the process must run with the correct userid.
Warning: For Pegasus earlier than 2.5, doing client
operations from a CIM provider significantly increases your security risk
if the initial client requester was not running as root. This is due
to the implementation which runs the provider in the CIM Server process space
with a single, often privileged, user so the provider it connects to will
be unable to use built-in authentication. Providers issuing WBEM client
operations must adequately address the security risk. A few alternatives
to address the security concern are: 1) ensure (either at design time or at
runtime in the provider) that the user is authorized to access the data being
requested from the second provider, and 2) the provider could launch another
process and issue the request to the second provider as the intended user.
Background on connectLocal():
A local connection mechanism exists for clients to communicate with the
CIM Server on the same system. The connectLocal() function is used for this
purpose, and does not take any arguments. In the case where PEGASUS_LOCAL_DOMAIN_SOCKET
is defined (the default on all platforms but Windows, where connectLocal
authentication is not functional as of 2.5), the user ID passed to the provider
is that of the process in which the client program is running. The CIM Server
verifies that the user ID of the request is indeed that of the requesting
process. Namespace authorization, if enabled, is still performed.
When the client must be able to connect to a CIM Server on a remote system,
or when it must be able to specify a different user than that of the process,
it must use the connect() function. This function allows a hostname and
port number to be specified, as well as a username and password. If
you need to use the connect() API, the WBEM client has several responsibilities
to ensure correct authentication and to protect confidential information.
Because connectLocal() does not use SSL, these guidelines only apply to
the connect() interface. Using connectLocal() bypasses these requirements
except where PEGASUS_LOCAL_DOMAIN_SOCKET is not defined. In that case,
it behaves like connect(), using HTTPS and/or HTTP as defined in Pegasus
settings.
- General programming standards
- Design for belligerent input. A separate module should be responsible
for validating all input before taking any action. Invalid input
should be discarded. If your client has high availability requirements,
deal with invalid input quickly to avoid Denial of Service attacks.
- Use a strongly-typed language if possible (e.g., Java). If your
client is in C++, then use a security scanner such as RATS (http://www.securesoftware.com/resources/download_rats.html)
to identify problem areas and follow the recommendations. (Note:
code scanners such as these tend to make a lot of recommendations, so
plan on adequate time for manual analysis and focus on your input validation
module.)
- Do not use world-writable files or directories (including /tmp and /var/tmp).
Make sure all credentials (passwords/certificates) are readable only by
their owner.
- Do not cache passwords unless directed to do so by the user. The
user should be aware that their password is being stored permanently on
the client machine.
- Do not pass passwords as an option on the command line on non-Windows
systems. Command lines on non-Windows systems are visible to all
users on the system.
- If possible, do not make any server-initiated changes on the client
system. Doing so increases the risk of security vulnerabilities
in your client, and a security reviewer should be consulted.
- If possible, log events of interest, including certificate warning messages
and invalid responses sent from the server. Doing so increases the ability
of a user or system administrator to track down unauthorized actions.
Use either a user-specific logfile or syslog. Be sure to check for
corner cases like disk-space limitations.
- HTTP Indications should only be used to send confidential information
in environments where the risk of man-in-the-middle attacks (e.g., a rogue
CIM Listener intercepting indications) is low.
If your listener expects to receive confidential information, be sure
to document clearly, for the customer initiating the subscription, that this
information will be visible to anyone on the network.
- Security Testing Guidelines
- Run the following tests, and ensure that your client gives a useful
error message and does not crash. Crashes on strange and unexpected
input are, at a minimum, a denial-of-service, and often represent buffer
or format-string vulnerabilities:
- CIM server you are connecting to is not available (disabled or network
problems)
- CIM server responds with an extremely large response
- CIM server or provider responds with invalid characters or garbage
in the response
- CIM server returns ‘access denied’
Platform Considerations
The coding guidelines may not help, but will not hurt implementations where
Pegasus and its providers are not run at elevated privilege. Examples of this
include environments with only one user or where Pegasus itself is executed
as the requesting user.
Copyright (c) 2005 EMC Corporation; Hewlett-Packard Development
Company, L.P.; IBM Corp.; The Open Group; VERITAS Software Corporation
Permission is hereby granted, free of charge, to any person
obtaining a copy of this software and associated documentation files (the
"Software"), to deal in the Software without restriction, including without
limitation the rights to use, copy, modify, merge, publish, distribute,
sublicense, and/or sell copies of the Software, and to permit persons to whom
the Software is furnished to do so, subject to the following conditions:
THE ABOVE COPYRIGHT NOTICE AND THIS PERMISSION NOTICE SHALL BE INCLUDED IN ALL
COPIES OR SUBSTANTIAL PORTIONS OF THE SOFTWARE. THE SOFTWARE IS PROVIDED
"AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT
LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE
AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE
LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF
CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.