Workshop held 9-10 January 2008 in Arlington, VA. Prepared for US Strategic Command Global Innovation and Strategy Center (USSTRATCOM/GISC).
The researchers performed several simulations using the ASAM tool, including one which utilized the truck bombing HMM shown in Figure 7. They generated synthetic transaction data to simulate the events corresponding to the truck bombing plot, and mixed the data with background “noise” transactions. The simulation investigated the “likelihood of observations” result, which corresponds to the probability the HMM reports to its DBN node (Figure 10b), and the global probability of a terrorist attack for the abridged DBN shown in Figure 10a. The results of the simulation indicate that the global probability of a terrorist attack peaks at 48%, which could be interpreted by an analyst as likely. The developers of the ASAM system assert that it is intended to provide analysts with “soft alerts rather than hard decisions.” False positives are an inevitable result, but they can be minimized by obtaining accurate model parameters from the input of multiple intelligence analysts. Future work includes the addition of feedback capabilities via influence diagrams, which will allow analysts to simulate the impact of counter-terrorism measures on the threat level.
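The “likelihood of observations” quantity that an HMM reports upward to its DBN node can be computed with the standard forward algorithm. The sketch below uses a hypothetical three-state, two-symbol model; none of the probability values are taken from the Figure 7 truck bombing HMM.

```python
# Hypothetical 3-state HMM loosely mirroring plot phases:
# state 0 = planning, 1 = acquisition, 2 = attack preparation.
# All parameter values are illustrative, not from the ASAM model.
A = [[0.7, 0.3, 0.0],   # transition probabilities between states
     [0.0, 0.8, 0.2],
     [0.0, 0.0, 1.0]]
B = [[0.9, 0.1],        # emission probabilities for two observable
     [0.6, 0.4],        # transaction types: 0 = routine, 1 = suspicious
     [0.3, 0.7]]
pi = [1.0, 0.0, 0.0]    # the plot starts in the planning state

def likelihood_of_observations(obs):
    """Forward algorithm: P(observation sequence | model), the value
    an HMM would pass upward to its DBN node."""
    alpha = [pi[s] * B[s][obs[0]] for s in range(3)]
    for o in obs[1:]:
        alpha = [sum(alpha[s] * A[s][t] for s in range(3)) * B[t][o]
                 for t in range(3)]
    return sum(alpha)

print(likelihood_of_observations([0, 1, 1]))
```

A rising likelihood over successive observation windows, rather than the raw value itself, is what would nudge the parent DBN node toward a higher global threat probability.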
The ASAM system is also a component of a larger collaborative tool for counter-terrorism analysis called the Network Modeling Environment for Structural Intervention Strategies (NEMESIS). The NEMESIS environment “provides a forum for information exchange among multiple modeling or analysis tools, and model-based team collaboration” (Popp et al, 2004).
The platform utilizes Organizational Descriptive Language (ODL), which allows users to experiment graphically with different types of models. In addition to ASAM, NEMESIS incorporates the Organizational Risk Analysis (ORA) tool developed by researchers at Carnegie Mellon University. ORA is a “network tool that detects risks and vulnerabilities in an organization’s design structure,” which makes it useful for simulation of terrorist network destabilization strategies, such as those discussed below (Popp et al, 2004).
Deterring VNSA in Cyberspace
Figure 10a (Above): Simplified global threat model used in the simulation of the impact of a truck bombing attack.
Figure 10b (Right): Likelihood of observation result from the application of the truck bombing HMM (top), and corresponding global probability of terrorist attack (bottom) (Singh, Tu et al, 2004).
Predictive Modeling: Destabilization of Terrorist Networks
The modeling techniques discussed so far have approached the emergent behavior of VNSAs in cyberspace from a passive perspective, excluding analysis of the impact counter-terrorism measures may have on a group’s ability to function. This type of analysis is currently being investigated by researchers at the Carnegie Mellon Center for Computational Analysis of Social and Organizational Systems (CASOS). They propose a Dynamic Network Analysis (DNA) approach, which “extends the power of thinking about networks to the realm of large-scale, dynamic systems with multiple co-evolving networks under conditions of information uncertainty with cognitively realistic agents” (Carley et al, 2003). The approach takes into account the dynamic and covert nature of terrorist networks, which are generally composed of semi-autonomous cells as opposed to hierarchical structures. Attempting to destabilize terrorist networks using strategies developed for well-defined, hierarchical networks is unlikely to be effective, hence the development of the DNA approach.
The group approached the problem by focusing on three key questions: “What is the size and shape of the covert network?”; “How does the nation in which the covert network exists impact its form and ability?”; and “If we do x to the covert network, what is likely to happen?” The approach they developed (Carley et al, 2003) utilizes the following seven-step process for assessing various destabilization strategies:
1. Identify key network entities and connections between them.
2. Identify key processes by which entities or connections are added or dropped, or by which the strength of connections changes.
3. Collect data on the covert network.
4. Determine performance characteristics of the existing system.
5. Determine performance characteristics of the optimal system, if applicable.
6. Locate vulnerabilities in the network and select destabilization strategies.
7. Determine performance characteristics in the short and long term after destabilization strategy has been applied.
The initial testbed for this methodology was composed of open-source data describing the terrorist network associated with the embassy bombing in Tanzania. Generally, the group defines “entities” as people (agents), knowledge, resources, events, tasks, groups, and countries, but for this analysis a simplified set was used that consisted of people, resources, and tasks.
The performance of the system was simulated using the software DyNet, also developed by the CASOS group. DyNet is a “multi-agent network system for assessing destabilization strategies on dynamic networks,” and input to the system is a “knowledge network” composed of the “individuals’ knowledge about whom they know, what resources they have, and what task they are doing” (Carley et al, 2003). The group also assessed the efficiency of the network’s structure by comparing it to its optimal configuration, which was ascertained by minimizing the vulnerabilities caused by workload and the distribution of resources and communication ties. The results indicate that the organization was not particularly well designed, since it required 88 changes to “who is doing what and has what resources” to reach the optimal configuration. It was also noted, however, that these results could indicate that a substantial amount of information on the organization’s structure is missing.
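The distance-from-optimal count described above (the “88 changes”) can be made concrete with a toy meta-matrix. Everything below - the agents, resources, tasks, and the “optimal” assignment - is invented for illustration and is not drawn from the Tanzania dataset.

```python
# Toy "meta-matrix" in the spirit of DyNet's knowledge-network input.
# Rows are agents A1..A3; columns are resources or tasks; 1 = assigned.
# All assignments below are hypothetical.
agent_x_resource = [[1, 0], [0, 1], [1, 1]]
agent_x_task     = [[1, 0], [0, 1], [0, 1]]

# A hypothetical "optimal" configuration produced by some optimizer.
opt_resource     = [[1, 0], [0, 1], [0, 1]]
opt_task         = [[1, 0], [1, 0], [0, 1]]

def changes_needed(current, optimal):
    """Number of cell flips separating the observed assignment from the
    optimal one - the same kind of count behind the '88 changes' figure."""
    return sum(abs(c - o)
               for row_c, row_o in zip(current, optimal)
               for c, o in zip(row_c, row_o))

total = (changes_needed(agent_x_resource, opt_resource)
         + changes_needed(agent_x_task, opt_task))
print(total)  # -> 3
```

A larger count means more reassignments of people, resources, and tasks are needed before the organization matches its most efficient design - or, as the researchers caution, that the observed network is simply incomplete.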
Next, the impact of four destabilization strategies on the performance of the network was assessed. The strategies consisted of eliminating the person ranking highest in one of four measures: Degree Centrality, Betweenness Centrality, Cognitive Load, or Task Exclusivity. (Degree and betweenness centrality are defined above in the discussion of link/social network analysis.) The simulation was a two-step process, beginning with the ORA tool described above, which evaluated the “resource congruence” of the group with and without the individuals ranking high in these measures. DyNet was then applied to the altered networks, and the performance was evaluated for changes in the ease and rate of communication flow, and the ability of the organization to adapt to these changes. Table 4 lists the results of the assessment for the two agents whose removal had the largest impact on the performance of the network: agents 5 and 7. The results indicate that the removal of either agent does not significantly affect the network’s distance from the “optimal” configuration, so the researchers conclude that the effects of either removal in this case would be small.
Some of the results appear incongruous. For example, removal of agent 5 actually increases the resource congruence of the network, which is not the expected outcome for removal of an important node. However, the researchers explain that “resource congruence is a strict measure such that congruence is decreased when either agents do not have the resources they needed for the task to which they are assigned or when agents have resources that are not necessary for the task they are assigned. Removal of agent 5 is reducing the presence of unnecessary resources … [making] the organizational design leaner.” The diffusion results, which indicate the rate and ease with which information can be spread throughout the network, are more intuitive. Removal of agent 7 is disruptive to the flow of communication because it decreases the potential diffusion rate. In contrast, removal of agent 5 actually increases the potential rate of communication between nodes. The researchers point out that this “potentially makes the organization more vulnerable to information warfare attacks,” since both correct and incorrect information can be disseminated more rapidly as a result of removing this agent.
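The remove-and-reassess loop described above can be sketched without either ORA or DyNet. The network below is a made-up two-cell structure (not the embassy bombing data); the target is chosen by the simpler criterion of “which single removal hurts diffusion most” rather than the study’s centrality measures, and global efficiency - the mean inverse path length over all pairs - stands in for DyNet’s potential diffusion rate.

```python
from collections import deque

# Hypothetical two-cell covert network bridged by agent "a3"
# (illustrative; not the embassy-bombing dataset).
edges = [("a1", "a2"), ("a1", "a3"), ("a2", "a3"), ("a1", "a7"),
         ("a3", "a4"), ("a4", "a5"), ("a4", "a6"), ("a5", "a6")]

def adjacency(edge_list, removed=frozenset()):
    """Build an adjacency map, skipping edges that touch removed agents."""
    adj = {}
    for u, v in edge_list:
        if u in removed or v in removed:
            continue
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def bfs_distances(adj, src):
    """Breadth-first-search shortest-path distances from src."""
    dist, queue = {src: 0}, deque([src])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def global_efficiency(adj):
    """Mean inverse shortest-path length over all pairs; unreachable
    pairs contribute zero, so fragmentation lowers the score."""
    nodes = list(adj)
    total = pairs = 0
    for i, u in enumerate(nodes):
        d = bfs_distances(adj, u)
        for v in nodes[i + 1:]:
            pairs += 1
            total += 1 / d[v] if v in d else 0
    return total / pairs

baseline = global_efficiency(adjacency(edges))
nodes = {n for e in edges for n in e}
# Target the agent whose removal degrades diffusion the most.
target = min(nodes, key=lambda n: global_efficiency(adjacency(edges, {n})))
after = global_efficiency(adjacency(edges, {target}))
print(target, round(baseline, 3), round(after, 3))
```

In this toy network the selected target is the bridge agent, and its removal splits the network into two cells - the same qualitative behavior the study observed for agent 7, whose removal depressed the potential diffusion rate.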
The results of this study demonstrate the potential of this methodology, but it is clear that much future work is needed to perfect the process. As emphasized by the CASOS group, it is important to take into account the fact that covert network assessments will be ill-informed and constantly changing, and the lack of complete information on the structure of the Tanzania terrorist network is a likely cause of the somewhat inconclusive simulation results. These issues further emphasize the importance of having a content-rich dataset that can adequately inform predictive models.
Benefits, Challenges, and Caveats
The terrorist attacks of September 11, 2001, “spurred extraordinary efforts intended to protect America from the newly highlighted scourge of international terrorism” (Jonas and Harper, 2006). These efforts included significant interest in the potential use of predictive data mining techniques as a means of uncovering covert terrorist networks and plots, and since then, the implementation of such techniques has been surrounded by controversy. According to the National Commission on Terrorist Attacks upon the United States, if the government had pursued the leads available at the time, the attacks could have been prevented. This raises the question: could data mining and predictive modeling techniques have played a role in averting the tragedy? According to a report by Jeff Jonas and Jim Harper for Policy Analysis (Jonas and Harper, 2006), the answer is no. They define data mining as “the process of searching data for previously unknown patterns and using those patterns to predict future outcomes,” and describe it as “not well-suited to the terrorist discovery problem.” In particular, they do not believe that predictive data mining would have made an impact on preventing 9/11. They assert that what law enforcement officials needed was not new technology, but “a sharper focus and perhaps the ability to more efficiently locate, access, and aggregate information about specific suspects.” The report also emphasizes the high likelihood of false positives - cases where individuals are incorrectly classified as “suspicious” due to some combination of activities that correlates with a “terrorist-behavior” pattern. The authors cite the use of predictive data mining in consumer direct-marketing campaigns, which utilize demographic profiles of potential customers to target mailings to individuals who are statistically likely to buy certain products.
Despite having access to millions of customer profiles to train their algorithms, the positive response rate for this type of advertising is in the single digits, corresponding to a minimum 90% false positive rate (Direct Marketing Assoc., 2004). In comparison, terror-related plots are much smaller in number, with “only one or two major terrorist incidents every few years - each one distinct in terms of planning and execution” (Jonas and Harper, 2006). This lack of historical data prohibits the creation of valid predictive models, opening the door to the possibility of an overwhelming number of false positives that would waste valuable financial and law enforcement resources.
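The base-rate arithmetic driving this concern is easy to make concrete. The figures below are invented for illustration and are deliberately generous to the classifier; they are not from the Jonas and Harper report.

```python
# Illustrative base-rate arithmetic (all figures hypothetical).
population = 300_000_000         # people whose transactions are screened
actual_plotters = 3_000          # assumed true positives in the population
sensitivity = 0.99               # fraction of plotters the model flags
false_positive_rate = 0.01       # fraction of innocents wrongly flagged

flagged_plotters = actual_plotters * sensitivity
flagged_innocents = (population - actual_plotters) * false_positive_rate
precision = flagged_plotters / (flagged_plotters + flagged_innocents)

print(f"innocents flagged: {flagged_innocents:,.0f}")
print(f"precision: {precision:.3%}")  # well under 1% of alerts are real
```

Even with an implausibly accurate model, a tiny base rate of actual plotters means the alert queue is dominated by false positives - which is the core of the report’s objection to predictive data mining for terrorist discovery.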
While Jonas and Harper strongly oppose the use of predictive data mining for the detection of covert terrorist plots and networks, their opinions are not contrary to what most researchers believe to be the limitations, realistic expectations, and proper application of predictive modeling techniques. The consensus is that these techniques should only be expected to produce meaningful results if they are well informed, particularly by seed information from outside authoritative sources (Last, 2005). Predictive modeling should be used as a “power tool for analysts and investigators - a way to conduct low-level tasks that will provide clues to assist analysts and investigators” (DeRosa, 2004).
Another controversial issue surrounding the use of predictive data mining techniques is their potential to infringe on individuals’ privacy if not executed in a responsible manner. If “data mining or automated data analysis…is deemed acceptable given the potential harm of catastrophic terrorism,…there will be great temptation to expand to use of [the] tools” to other high-profile illegal behavior - a phenomenon known as “mission creep” (DeRosa, 2004). Experts propose the implementation of a four-step plan “designed to protect privacy and prevent abuse,” should the government gain access to large databases of private information (DeRosa, 2004).
The plan consists of: 1) developing technology to address inaccurate data and false positives; 2) developing technology designed to “mask or selectively reveal identifying data”; 3) implementing audit technology; and 4) implementing “permissioning” technology.
As mentioned previously, reducing false positives can be accomplished by utilizing “cleaner” datasets and refining the models used for pattern-based analyses. Anonymization would provide analysts with access to identifying information, such as names, addresses, and Social Security numbers, on a need-to-know basis only. Audit technology is a secondary level of defense intended to “watch the watchers” - that is, to protect against authorized users with access to identifying information who would abuse their authority. Finally, permissioning technology involves the implementation of rule-based processing in which policies are built directly into the search engines that access private data. Users would be required to present evidence of permission to access content, such as a warrant, and the system would automatically grant access to only that content (DeRosa, 2004).
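The masking and permissioning ideas can be sketched as a policy check enforced inside the query layer itself. The field names, the record, and the warrant token below are hypothetical illustrations, not a real system’s API; a real deployment would bind the check to vetted legal process and log every access for audit.

```python
# Sketch of rule-based "permissioning" in the spirit of DeRosa (2004).
# Field names and the warrant check are hypothetical illustrations.
IDENTIFYING_FIELDS = {"name", "address", "ssn"}

record = {"name": "J. Doe", "address": "123 Main St",
          "ssn": "000-00-0000", "pattern_score": 0.87}

def query(record, requested_fields, warrant=None):
    """Return requested fields, masking identifying data unless the
    request presents evidence of permission (e.g., a warrant)."""
    result = {}
    for field in requested_fields:
        if field in IDENTIFYING_FIELDS and warrant is None:
            result[field] = "<masked>"   # anonymized by default
        else:
            result[field] = record[field]
    return result

# Without a warrant the analyst sees only the non-identifying score.
print(query(record, ["name", "pattern_score"]))
# Presenting permission selectively reveals the identifying field.
print(query(record, ["name", "pattern_score"], warrant="W-1234"))
```

Placing the rule inside the query path, rather than relying on analyst discipline, is what distinguishes permissioning from ordinary access policy: the system can only return what the presented credential authorizes.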
Research and Development Directions