commit 4923a44d3750c83d729b64aef0b6c16cf2a1ccc1 · hailey.at/atproto-ruleset

+214

README.md

# Osprey ATProto Ruleset

This is a ruleset for [Osprey](https://github.com/roostorg/osprey) for use with ATProto, and specifically for [Bluesky](https://bsky.app). It is the ruleset that is used live on the [labeler that I personally run](https://bsky.app/profile/labeler.hailey.at). It may be used in conjunction with [my fork of Osprey](https://github.com/haileyok/osprey), which has implemented various components required for these rules (ATProto labels output sink, ML model calling, Redis counter system, etc).

## Using This Ruleset

The easiest way to get started with these rules is to clone them into your Osprey rules directory, wherever that is located. For example, cloning down the official (or forked) version of Osprey will leave you with an `example_rules/` directory. Replacing the contents of that directory with this repository's contents will allow these rules to run.

Again, note that you will need the sinks and UDFs that these rules require, which are maintained inside of my Osprey fork for the time being.

# Writing Rules

> [!NOTE]
> This documentation is a WIP for _generic_ rule writing, not ATProto specific rules. More documentation will come for ATProto specific rules.

Osprey rules are written in SML, a sort of subset of Python (think Starlark). You can write rules that are specific to certain types of events that happen on a network, or rules that take effect regardless of event type, depending on the type of behavior or patterns you are looking for.

## Structuring Rules

You will likely find it useful to maintain two subdirectories inside of your main rules directory: a `rules` directory for the actual rule logic and a `models` directory for defining the features that occur in all or only specific event types. For example, your structure may look something like this:

```bash
example_rules/
| rules/
| | record/
| | | post/
| | | | first_post_link.sml
| | | | index.sml
| | | like/
| | | | like_own_post.sml
| | | | index.sml
| | account/
| | | signup/
| | | | high_risk_signup.sml
| | | | index.sml
| | index.sml
| models/
| | record/
| | | post.sml
| | | like.sml
| | account/
| | | signup.sml
| main.sml
```

This sort of structure lets you define rules and models that are specific to certain event types, so that only the necessary rules are run for each event type. For example, you likely have some rules that should only be run on a `post` event, since only a `post` will have features like `text` or `mention_count`.

Inside of each directory, you may maintain an `index.sml` file that defines the conditions under which the rules inside that directory are actually included for execution. Although you could handle all of this conditional logic inside of a single file, maintaining a separate `index.sml` per directory keeps things much better organized.

## Models

Before you actually write a rule, you’ll need to define a “model” for an event type. For this example, we will assume that you run a social media website that lets users create posts, either at the “top level” or as a reply to another top level post. Each post may include text, mentions of other users on your network, and an optional link embed in the post. Let’s say that the event’s JSON structure looks like this:

```json
{
  "eventType": "userPost",
  "user": {
    "userId": "user_id_789",
    "handle": "carol",
    "postCount": 3,
    "accountAgeSeconds": 9002
  },
  "postId": "abc123xyz",
  "replyId": null,
  "text": "Is anyone online right now? @alice or @bob, you there? If so check this video out",
  "mentionIds": ["user_id_123", "user_id_456"],
  "embedLink": "https://youtube.com/watch?id=1"
}
```

Inside of our `models/record` directory, we should now create a `post.sml` file where we will define the features for a post.

```python
PostId: Entity[str] = EntityJson(
    type='PostId',
    path='$.postId',
)

PostText: str = JsonData(
    path='$.text',
)

MentionIds: List[str] = JsonData(
    path='$.mentionIds',
)

EmbedLink: Optional[str] = JsonData(
    path='$.embedLink',
    required=False,
)

# A reply points at another post, so it is an entity of the same type as PostId
ReplyId: Entity[str] = EntityJson(
    type='PostId',
    path='$.replyId',
    required=False,
)
```

The `JsonData` UDF (more on UDFs to follow) lets us take the event’s JSON and define features based on the contents of that JSON. These features can then be referenced in any rule that imports the `models/record/post.sml` model. If you have any values inside your JSON object that may not always be present, you can set `required` to `False`, and these features will be `None` whenever the value is not present.

Note that we did not actually create any features for things like `userId` or `handle`. That is because these values will be present in *any* event. It wouldn’t be very nice to have to copy these features into each event type’s model. Therefore, we will create a `base.sml` model that defines the features which are always present. Inside of `models/base.sml`, let’s define these.

```python
EventType: str = JsonData(
    path='$.eventType',
)

UserId: Entity[str] = EntityJson(
    type='UserId',
    path='$.user.userId',
)

Handle: Entity[str] = EntityJson(
    type='Handle',
    path='$.user.handle',
)

PostCount: int = JsonData(
    path='$.user.postCount',
)

AccountAgeSeconds: int = JsonData(
    path='$.user.accountAgeSeconds',
)
```

Here, instead of simply using `JsonData`, we use the `EntityJson` UDF for some features. More on this later, but as a rule of thumb, you will likely want values such as a user’s ID to be entities. This pays off later, for example when doing data explorations within the Osprey UI.

### Model Hierarchy

In practice, you may find it useful to create a hierarchy of base models:

- `base.sml` for features present in every event (user IDs, handles, account stats, etc.)
- `account_base.sml` for features that appear only in account related events, but always appear in each account related event. Similarly, you may add one like `record_base.sml` for those features which appear in all record events.

This type of hierarchy prevents duplication (which Osprey does not allow) and ensures features are defined at the appropriate level of abstraction.

## Rules

More in-depth documentation on rule writing can be found in `docs/WRITING-RULES.md`; however, we’ll offer a brief overview here.

Let's imagine we want to flag accounts whose first post mentions at least one user and includes a link. We’ll create a `.sml` file at `rules/record/post/first_post_link.sml` for our rule logic. This file will include both the conditions that result in the rule evaluating to `True` and the actions that we want to take if that rule does indeed evaluate to `True`.

```python
# First, import the models that you will need inside of this rule
Import(
    rules=[
        'models/base.sml',
        'models/record/post.sml',
    ],
)

# Next, define a variable that uses the `Rule` UDF
FirstPostLinkRule = Rule(
    # Set the conditions in which this rule will be `True`
    when_all=[
        PostCount == 1, # if this is the user's first post
        EmbedLink != None, # if there is a link inside of the post
        ListLength(list=MentionIds) >= 1, # if there is at least one mention in the post
    ],
    description='First post for user includes a link embed',
)

# Finally, set which effect UDFs (more on this later) will be triggered
WhenRules(
    rules_any=[FirstPostLinkRule],
    then=[
        ReportRecord(
            entity=PostId,
            comment='This was the first post by a user and included a link',
            severity=3,
        ),
    ],
)
```

We also want to make sure this rule runs *only* when the event is a post event. Since we have a well defined project structure, this is pretty easy. We’ll start by modifying the `main.sml` at the project root to include a single, simple `Require` statement.

```python
Require(
    rule='rules/index.sml',
)
```

Next, inside of `rules/index.sml` we will define the conditions that result in post rules executing:

```python
Import(
    rules=[
        'models/base.sml',
    ],
)

Require(
    rule='rules/record/post/index.sml',
    require_if=EventType == 'userPost',
)
```

Finally, inside of `rules/record/post/index.sml` we will require this new rule that we have written.

```python
Import(
    rules=[
        'models/base.sml',
        'models/record/post.sml',
    ],
)

Require(
    rule='rules/record/post/first_post_link.sml',
)
```
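To make the `when_all` semantics of the README's `FirstPostLinkRule` concrete, here is a minimal plain-Python sketch that evaluates the same three conditions against the sample event from the Models section. It is not how Osprey executes SML; `first_post_link_rule` is a hypothetical stand-in, and the dictionary lookups stand in for the `JsonData` paths.

```python
import json

# Sample event copied from the README's Models section.
EVENT = json.loads("""
{
  "eventType": "userPost",
  "user": {"userId": "user_id_789", "handle": "carol",
           "postCount": 3, "accountAgeSeconds": 9002},
  "postId": "abc123xyz",
  "replyId": null,
  "text": "Is anyone online right now? @alice or @bob, you there? If so check this video out",
  "mentionIds": ["user_id_123", "user_id_456"],
  "embedLink": "https://youtube.com/watch?id=1"
}
""")


def first_post_link_rule(event: dict) -> bool:
    """Hypothetical stand-in for FirstPostLinkRule: every condition must hold."""
    conditions = [
        event["user"]["postCount"] == 1,        # PostCount == 1
        event.get("embedLink") is not None,     # EmbedLink != None
        len(event.get("mentionIds", [])) >= 1,  # ListLength(list=MentionIds) >= 1
    ]
    return all(conditions)


# Same event, but as if it were the user's first post.
first_post = {**EVENT, "user": {**EVENT["user"], "postCount": 1}}

print(first_post_link_rule(EVENT))       # False: the sample user already has 3 posts
print(first_post_link_rule(first_post))  # True
```

The sample event as written does not trigger the rule, because `postCount` is 3; only the modified copy does.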

+39

config/config.yaml

··· 1 + # For uris you can use {did}, {collection}, and {rkey} to parse at uris 2 + # For uris you can use {did}, {collection}, and {rkey} to parse at uris 3 + ui_config: 4 + default_summary_features: 5 + - actions: ['operation#*'] 6 + features: 7 + - UserId 8 + - Handle 9 + - DisplayName 10 + - Collection 11 + - AtUri 12 + - AccountCreatedAt 13 + - PdsHost 14 + - FollowersCount 15 + - FollowingCount 16 + - PostsCount 17 + - PostText 18 + - PostReplyRoot 19 + - PostReplyParent 20 + - PostExternalTitle 21 + - PostExternalDescription 22 + - PostExternalLink 23 + - SentimentScore 24 + - FollowSubjectDid 25 + - ListName 26 + - ListPurpose 27 + - ListitemList 28 + - ListitemSubjectDid 29 + - LikeSubject 30 + - LikeSubjectDid 31 + - RepostSubject 32 + - RepostSubjectDid 33 + - ProfileDisplayName 34 + - ProfileDescription 35 + 36 + # For uris you can use {did}, {collection}, and {rkey} to parse at uris 37 + external_links: 38 + UserId: 'https://bsky.app/profile/{entity_id}' 39 + AtUri: 'https://pdsls.dev/{entity_id}'
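The `external_links` templates above can be read as simple string templates keyed by entity type. The sketch below assumes the UI substitutes the entity's ID for `{entity_id}`; that substitution step and the `external_link` helper are assumptions for illustration, not Osprey's actual code.

```python
from typing import Optional

# Templates copied from config/config.yaml.
EXTERNAL_LINKS = {
    "UserId": "https://bsky.app/profile/{entity_id}",
    "AtUri": "https://pdsls.dev/{entity_id}",
}


def external_link(entity_type: str, entity_id: str) -> Optional[str]:
    """Hypothetical helper: fill the template for an entity type, if one is configured."""
    template = EXTERNAL_LINKS.get(entity_type)
    if template is None:
        return None
    return template.format(entity_id=entity_id)


print(external_link("UserId", "did:plc:example"))
```

Entity types without a configured template (for example `Handle` or `PdsHost`) simply produce no link in this sketch.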

+57

config/labels.yaml

labels:
  men-facet-abuse:
    valid_for: [UserId]
    connotation: neutral
    description: Account has been abusing facet mentions
  mass-follow-mid:
    valid_for: [UserId]
    connotation: neutral
    description: Account has followed 300+ accounts in 30 minutes
  mass-follow-high:
    valid_for: [UserId]
    connotation: neutral
    description: Account has followed 1000+ accounts in 30 minutes
  shopping-spam:
    valid_for: [UserId]
    connotation: neutral
    description: Account has posted 15+ shopping links in 30 minutes
  inauth-fundraising:
    valid_for: [UserId]
    connotation: neutral
    description: Account is likely performing inauthentic fundraising
  reply-link-spam:
    valid_for: [UserId]
    connotation: neutral
    description: Account has replied with a link twenty or more times in a 24 hour period
  stpk-creations:
    valid_for: [UserId]
    connotation: neutral
    description: Account has made more than two starterpacks in a week
  some-blocks:
    valid_for: [UserId]
    connotation: neutral
    description: Account was blocked 20+ times in 24 hours
  mass-blocks:
    valid_for: [UserId]
    connotation: neutral
    description: Account was blocked 100+ times in 24 hours
  handle-changed:
    valid_for: [UserId]
    connotation: neutral
    description: Account has changed their handle recently.
  many-handle-chgs:
    valid_for: [UserId]
    connotation: neutral
    description: Account has changed their handle 3+ times in a 24 hour period.
  suss-handle-change:
    valid_for: [UserId]
    connotation: neutral
    description: Suspicious handle change
  new-acct-replies:
    valid_for: [UserId]
    connotation: neutral
    description: Account made 10+ replies in their first hour with low top level count.
  new-acct-slurs:
    valid_for: [UserId]
    connotation: neutral
    description: Relatively new account found to be making slur posts.

+3

lists/fundraise_domains.yaml

··· 1 + - chuffed.org 2 + - gofundme.org 3 + - gofund.me

+6

lists/shopping.yaml

··· 1 + - amazon.com 2 + - a.co 3 + - amzn.com 4 + - dlvr.it 5 + - skystore.top 6 + - amzns.life

+21

lists/slurs.yaml

··· 1 + - nigers 2 + - troon 3 + - pzoverlord 4 + - janny 5 + - jannie 6 + - nigger 7 + - rapehon 8 + - trannies 9 + - retard 10 + - retarded 11 + - faggit 12 + - fagget 13 + - phaggot 14 + - phagget 15 + - kike 16 + - spic 17 + - chink 18 + - towelhead 19 + - towel head 20 + - transvestite 21 + - libtard

+5

lists/slurs_low.yaml

··· 1 + - tranny 2 + - faggot 3 + - nigga 4 + - subhuman 5 + - hermaphrodite

+4

main.sml

··· 1 + Import(rules=['models/base.sml']) 2 + 3 + Require(rule='rules/index.sml') 4 +

+89

models/base.sml

··· 1 + ActionName=GetActionName() 2 + 3 + UserId: Entity[str] = EntityJson( 4 + type='UserId', 5 + path='$.did', 6 + required=False, 7 + ) 8 + 9 + Handle: Entity[str] = EntityJson( 10 + type='Handle', 11 + path='$.eventMetadata.handle', 12 + required=False, 13 + ) 14 + 15 + PdsHost: Entity[str] = EntityJson( 16 + type='PdsHost', 17 + path='$.eventMetadata.pdsHost', 18 + required=False, 19 + ) 20 + 21 + DisplayName: str = JsonData( 22 + path='$.eventMetadata.profile.displayName', 23 + required=False, 24 + coerce_type=True, 25 + ) 26 + 27 + FollowersCount: int = JsonData( 28 + path='$.eventMetadata.profile.followersCount', 29 + required=False, 30 + coerce_type=True, 31 + ) 32 + 33 + FollowingCount: int = JsonData( 34 + path='$.eventMetadata.profile.followingCount', 35 + required=False, 36 + coerce_type=True, 37 + ) 38 + 39 + PostsCount: int = JsonData( 40 + path='$.eventMetadata.profile.postsCount', 41 + required=False, 42 + coerce_type=True, 43 + ) 44 + 45 + Avatar: Optional[str] = JsonData( 46 + path='$.eventMetadata.profile.avatar', 47 + required=False, 48 + ) 49 + 50 + Banner: Optional[str] = JsonData( 51 + path='$.eventMetadata.profile.banner', 52 + required=False, 53 + ) 54 + 55 + HasAvatar = Avatar != None 56 + 57 + HasBanner = Banner != None 58 + 59 + AccountCreatedAt: Optional[str] = JsonData( 60 + path='$.eventMetadata.didCreatedAt', 61 + required=False, 62 + ) 63 + 64 + AccountAgeSeconds: Optional[int] = JsonData( 65 + path='$.eventMetadata.accountAge', 66 + required=False, 67 + ) 68 + 69 + AccountAgeSecondsUnwrapped: int = ResolveOptional( 70 + optional_value=AccountAgeSeconds, 71 + default_value=999999999, 72 + ) 73 + 74 + OperationKind: Optional[str] = JsonData( 75 + path='$.operation.action', 76 + required=False, 77 + ) 78 + 79 + IsOperation = OperationKind != None 80 + 81 + 82 + Second: int = 1 83 + Minute: int = Second * 60 84 + FiveMinute: int = Minute * 5 85 + TenMinute: int = Minute * 10 86 + ThirtyMinute: int = Minute * 30 87 + Hour: int = 
Minute * 60 88 + Day: int = Hour * 24 89 + Week: int = Day * 7
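The duration constants at the end of `models/base.sml` are expressed in seconds, so they fit `window_seconds` directly but not a parameter measured in hours such as `expiration_in_hours`. A small Python sketch of the arithmetic, where `seconds_to_hours` is a hypothetical helper and not one of the fork's UDFs:

```python
# Mirrors the constants in models/base.sml (all values are seconds).
SECOND = 1
MINUTE = SECOND * 60
HOUR = MINUTE * 60
DAY = HOUR * 24
WEEK = DAY * 7


def seconds_to_hours(seconds: int) -> int:
    """Convert a whole number of seconds to whole hours, rejecting remainders."""
    hours, remainder = divmod(seconds, HOUR)
    if remainder:
        raise ValueError(f"{seconds} seconds is not a whole number of hours")
    return hours


print(seconds_to_hours(WEEK))  # 168, the same value as 7 * 24
print(7 * DAY)                 # 604800: roughly 69 years if read as hours
```

Writing `7 * 24` for an hours parameter, or converting explicitly as above, avoids passing a seconds value where hours are expected.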

+5

models/identity.sml

··· 1 + IdentityEventHandle: str = JsonData( 2 + path='$.identity.handle', 3 + required=False, 4 + coerce_type=True, 5 + )

+38

models/record/base.sml

··· 1 + Import( 2 + rules=[ 3 + 'models/base.sml', 4 + ], 5 + ) 6 + 7 + IsCreate = OperationKind == 'create' 8 + IsUpdate = OperationKind == 'update' 9 + IsDelete = OperationKind == 'delete' 10 + 11 + Collection: str = JsonData( 12 + path='$.operation.collection', 13 + ) 14 + 15 + Path: str = JsonData( 16 + path='$.operation.path', 17 + ) 18 + 19 + _UserIdResolved: str = ResolveOptional(optional_value=UserId) 20 + AtUri: Entity[str] = Entity( 21 + type='AtUri', 22 + id=f'at://{_UserIdResolved}/{Path}', 23 + ) 24 + 25 + Cid: str = JsonData( 26 + path='$.operation.cid', 27 + ) 28 + 29 + 30 + FacetLinkList: List[str] = LinksFromFacets() 31 + FacetLinkCount = ListLength(list=FacetLinkList) 32 + FacetLinkDomains = ExtractListDomains(list=FacetLinkList) 33 + 34 + FacetMentionList: List[str] = MentionsFromFacets() 35 + FacetMentionCount = ListLength(list=FacetMentionList) 36 + 37 + FacetTagList: List[str] = TagsFromFacets() 38 + FacetTagLength = ListLength(list=FacetTagList)
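`AtUri` above is built as `at://{did}/{path}`, and the record models later recover the DID from such a URI with `DidFromUri`. The Python sketch below shows both directions under two assumptions: that `$.operation.path` has the form `collection/rkey`, and that `DidFromUri` returns the authority segment when it is a DID. Both helpers are hypothetical illustrations, not the fork's implementation.

```python
from typing import Optional

AT_SCHEME = "at://"


def build_at_uri(did: str, path: str) -> str:
    """Join a repo DID and a record path (assumed 'collection/rkey') into an AT URI."""
    return f"{AT_SCHEME}{did}/{path}"


def did_from_uri(uri: Optional[str]) -> Optional[str]:
    """Return the DID authority of an AT URI, or None if it is missing or not a DID."""
    if not uri or not uri.startswith(AT_SCHEME):
        return None
    authority = uri[len(AT_SCHEME):].split("/", 1)[0]
    return authority if authority.startswith("did:") else None


uri = build_at_uri("did:plc:example", "app.bsky.feed.post/3kabc")
print(uri)
print(did_from_uri(uri))
```

Returning `None` for a missing URI matches the `Optional[str]` annotation used on features such as `LikeSubjectDid`.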

+5

models/record/block.sml

··· 1 + BlockSubjectDid: Entity[str] = EntityJson( 2 + type='UserId', 3 + path='$.operation.record.subject', 4 + coerce_type=True, 5 + )

+5

models/record/follow.sml

··· 1 + FollowSubjectDid: Entity[str] = EntityJson( 2 + type='UserId', 3 + path='$.operation.record.subject', 4 + coerce_type=True, 5 + )

+8

models/record/like.sml

··· 1 + LikeSubject: Entity[str] = EntityJson( 2 + type='AtUri', 3 + path='$.operation.record.subject.uri', 4 + required=True, 5 + coerce_type=True, 6 + ) 7 + 8 + LikeSubjectDid: Optional[str] = DidFromUri(uri=LikeSubject)

+9

models/record/list.sml

··· 1 + ListName: str = JsonData( 2 + path='$.operation.record.name', 3 + coerce_type=True, 4 + ) 5 + 6 + ListPurpose: str = JsonData( 7 + path='$.operation.record.purpose', 8 + coerce_type=True, 9 + )

+11

models/record/listitem.sml

··· 1 + ListitemSubjectDid: Entity[str] = EntityJson( 2 + type='UserId', 3 + path='$.operation.record.subject', 4 + coerce_type=True, 5 + ) 6 + 7 + ListitemList: Entity[str] = EntityJson( 8 + type='AtUri', 9 + path='$.operation.record.list', 10 + coerce_type=True, 11 + )

+106

models/record/post.sml

··· 1 + Import( 2 + rules=['models/base.sml'], 3 + ) 4 + 5 + PostText: str = JsonData( 6 + path='$.operation.record.text', 7 + required=False, 8 + coerce_type=True, 9 + ) 10 + 11 + PostTextCleaned: str = CleanString(s=PostText) 12 + 13 + PostTextTokens: List[str] = Tokenize( 14 + s=PostText, 15 + ) 16 + 17 + PostTextCleanedTokens: List[str] = Tokenize( 18 + s=PostTextCleaned, 19 + ) 20 + 21 + PostReplyParent: Entity[str] = EntityJson( 22 + type='AtUri', 23 + path='$.operation.record.reply.parent.uri', 24 + required=False, 25 + ) 26 + 27 + PostReplyParentDid: Optional[str] = DidFromUri(uri=PostReplyParent) 28 + 29 + PostIsSelfReply = UserId == PostReplyParentDid 30 + 31 + PostReplyRoot: Entity[str] = EntityJson( 32 + type='AtUri', 33 + path='$.operation.record.reply.root.uri', 34 + required=False, 35 + ) 36 + 37 + PostIsReply = PostReplyParent != None and PostReplyRoot != None 38 + 39 + _PostEmbedType: Optional[str] = JsonData( 40 + path="$.operation.record.embed.['$type']", 41 + required=False, 42 + ) 43 + 44 + _PostRecordWithMediaEmbedType: Optional[str] = JsonData( 45 + path="$.operation.record.embed.media.['$type']", 46 + required=False, 47 + ) 48 + 49 + PostHasImage = _PostEmbedType == 'app.bsky.embed.images' or (_PostEmbedType == 'app.bsky.embed.recordWithMedia' and _PostRecordWithMediaEmbedType == 'app.bsky.embed.images') 50 + 51 + PostHasVideo = _PostEmbedType == 'app.bsky.embed.video' or (_PostEmbedType == 'app.bsky.embed.recordWithMedia' and _PostRecordWithMediaEmbedType == 'app.bsky.embed.video') 52 + 53 + PostHasExternal = _PostEmbedType == 'app.bsky.embed.external' or (_PostEmbedType == 'app.bsky.embed.recordWithMedia' and _PostRecordWithMediaEmbedType == 'app.bsky.embed.external') 54 + 55 + PostExternalLink: Optional[str] = JsonData( 56 + path='$.operation.record.embed.external.uri', 57 + required=False, 58 + ) 59 + 60 + PostExternalTitle: Optional[str] = JsonData( 61 + path='$.operation.record.embed.external.title', 62 + required=False, 63 + ) 64 + 65 
+ PostExternalDescription: Optional[str] = JsonData( 66 + path='$.operation.record.embed.external.description', 67 + required=False, 68 + ) 69 + 70 + PostLanguages: List[str] = JsonData( 71 + path='$.operation.record.langs', 72 + coerce_type=True, 73 + required=False, 74 + ) 75 + 76 + PostTextDomains = ExtractDomains(s=PostText) 77 + 78 + PostAllDomains: List[str] = ConcatStringLists( 79 + lists=[ 80 + PostTextDomains, 81 + ExtractDomains(s=ForceString(s=PostExternalLink)), 82 + ], 83 + ) 84 + 85 + PostEmoji: List[str] = ExtractEmoji(s=PostText) 86 + 87 + SentimentScore: Optional[float] = AnalyzeSentiment(text=PostText, when_all=[ 88 + SimpleListContains( 89 + cache_name='sentiment_langs', 90 + list=['en'], 91 + phrases=PostLanguages, 92 + ) != None, 93 + ]) 94 + 95 + SentimentScoreUnwrapped: float = ResolveOptional(optional_value=SentimentScore, default_value=0.0) 96 + 97 + 98 + ToxicityScore: Optional[float] = AnalyzeToxicity(text=PostText, when_all=[ 99 + SimpleListContains( 100 + cache_name='sentiment_langs', 101 + list=['en'], 102 + phrases=PostLanguages, 103 + ) != None, 104 + ]) 105 + 106 + ToxicityScoreUnwrapped: float = ResolveOptional(optional_value=ToxicityScore, default_value=0.0)
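The `PostHasImage`, `PostHasVideo`, and `PostHasExternal` features in `models/record/post.sml` all follow one pattern: the embed is either of the wanted type directly, or it is a `recordWithMedia` embed whose nested `media` is of that type. A minimal Python sketch of that pattern, with `post_has_embed` as a hypothetical helper operating on the raw record dict:

```python
from typing import Any, Dict

RECORD_WITH_MEDIA = "app.bsky.embed.recordWithMedia"


def post_has_embed(record: Dict[str, Any], embed_type: str) -> bool:
    """True if the embed is `embed_type` directly or nested inside recordWithMedia."""
    embed = record.get("embed") or {}
    if embed.get("$type") == embed_type:
        return True
    media = embed.get("media") or {}
    return embed.get("$type") == RECORD_WITH_MEDIA and media.get("$type") == embed_type


quote_with_video = {
    "embed": {
        "$type": RECORD_WITH_MEDIA,
        "media": {"$type": "app.bsky.embed.video"},
    }
}
print(post_has_embed(quote_with_video, "app.bsky.embed.video"))   # True
print(post_has_embed(quote_with_video, "app.bsky.embed.images"))  # False
```

A text-only post has no `embed` key at all, which the `or {}` fallbacks treat as "no embed of any type".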

+29

models/record/profile.sml

ProfileDisplayName: str = JsonData(
    path='$.operation.record.displayName',
    required=False,
    coerce_type=True,
)

ProfileDisplayNameCleaned: str = CleanString(s=ProfileDisplayName)

ProfileDescription: str = JsonData(
    path='$.operation.record.description',
    required=False,
    coerce_type=True,
)

ProfileDescriptionCleaned: str = CleanString(s=ProfileDescription)

ProfileDescriptionTokens: List[str] = Tokenize(
    s=ProfileDescription,
)

ProfileDescriptionCleanedTokens: List[str] = Tokenize(
    s=ProfileDescriptionCleaned,
)

# Uses the same 'AtUri' entity type as the other record URIs in this ruleset
ProfilePinnedPost: Entity[str] = EntityJson(
    type='AtUri',
    path='$.operation.record.pinnedPost.uri',
    required=False,
)

+8

models/record/repost.sml

··· 1 + RepostSubject: Entity[str] = EntityJson( 2 + type='AtUri', 3 + path='$.operation.record.subject.uri', 4 + required=True, 5 + coerce_type=True, 6 + ) 7 + 8 + RepostSubjectDid: Optional[str] = DidFromUri(uri=RepostSubject)

+10

models/record/starterpack.sml

··· 1 + StarterpackList: Entity[str] = EntityJson( 2 + type='AtUri', 3 + path='$.operation.record.list', 4 + coerce_type=True, 5 + ) 6 + 7 + StarterpackName: str = JsonData( 8 + path='$.operation.record.name', 9 + coerce_type=True, 10 + )

+8

rules/identity/index.sml

··· 1 + Import( 2 + rules=[ 3 + 'models/base.sml', 4 + 'models/identity.sml', 5 + ], 6 + ) 7 + 8 + Require(rule='rules/identity/update_handle.sml')

+67

rules/identity/update_handle.sml

··· 1 + Import( 2 + rules=[ 3 + 'models/base.sml', 4 + 'models/identity.sml', 5 + ], 6 + ) 7 + 8 + _Counter = IncrementWindow( 9 + key=f'handle-{UserId}', 10 + window_seconds=24*Hour, 11 + when_all=[AccountAgeSecondsUnwrapped >= 300], 12 + ) 13 + 14 + HandleChangedRule = Rule( 15 + when_all=[AccountAgeSecondsUnwrapped >= 300], 16 + description='User has updated their handle recently.', 17 + ) 18 + 19 + SussHandleChangedRule = Rule( 20 + when_all=[ 21 + AccountAgeSecondsUnwrapped >= 7 * Day, 22 + PostsCount <= 1, 23 + FollowingCount <= 10, 24 + ], 25 + description='Suspicious handle change', 26 + ) 27 + 28 + MultipleHandleChangesRule = Rule( 29 + when_all=[_Counter == 3], 30 + description='User has updated their handle 3+ times in a 24 hour period recently.', 31 + ) 32 + 33 + WhenRules( 34 + rules_any=[HandleChangedRule], 35 + then=[ 36 + AtprotoLabel( 37 + entity=UserId, 38 + label='handle-changed', 39 + comment='User has updated their handle recently.', 40 + expiration_in_hours=7 * 24, 41 + ), 42 + ], 43 + ) 44 + 45 + WhenRules( 46 + rules_any=[MultipleHandleChangesRule], 47 + then=[ 48 + AtprotoLabel( 49 + entity=UserId, 50 + label='many-handle-chgs', 51 + comment='User has updated their handle 3+ times in a 24 hour period recently.', 52 + expiration_in_hours=7 * 24, 53 + ), 54 + ], 55 + ) 56 + 57 + WhenRules( 58 + rules_any=[SussHandleChangedRule], 59 + then=[ 60 + AtprotoLabel( 61 + entity=UserId, 62 + label='suss-handle-change', 63 + comment='Suspicious handle change', 64 + expiration_in_hours=7*24, 65 + ), 66 + ], 67 + )
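`MultipleHandleChangesRule` compares the counter with `== 3` rather than `>= 3`, which means the rule matches only on the event that moves the counter to exactly three. The sketch below illustrates that effect with an in-memory counter. It assumes `IncrementWindow` behaves like a fixed-window counter that returns the new count; the real UDF lives in the Osprey fork and is backed by Redis, so `FixedWindowCounter` is only a hypothetical model of it.

```python
from typing import Dict, List, Tuple

DAY = 24 * 60 * 60


class FixedWindowCounter:
    """Hypothetical in-memory model of a per-key counter that resets after a window."""

    def __init__(self) -> None:
        # key -> (window start time, count within the window)
        self._state: Dict[str, Tuple[float, int]] = {}

    def increment(self, key: str, window_seconds: int, now: float) -> int:
        start, count = self._state.get(key, (now, 0))
        if now - start >= window_seconds:
            start, count = now, 0  # window elapsed: start a fresh one
        count += 1
        self._state[key] = (start, count)
        return count


counter = FixedWindowCounter()
fired: List[int] = []
for i in range(5):  # five handle changes within the same day
    count = counter.increment("handle-did:plc:example", DAY, now=float(i))
    if count == 3:  # same comparison as MultipleHandleChangesRule
        fired.append(i)

print(fired)  # [2]: only the third change matches, later ones do not re-fire
```

With `>= 3` the fourth and fifth changes would match as well and emit the label again, which is presumably why the threshold rules in this ruleset mostly use equality.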

+15

rules/index.sml

··· 1 + Import( 2 + rules=[ 3 + 'models/base.sml', 4 + ], 5 + ) 6 + 7 + Require( 8 + rule='rules/record/index.sml', 9 + require_if=IsOperation, 10 + ) 11 + 12 + Require( 13 + rule='rules/identity/index.sml', 14 + require_if=ActionName == 'identity', 15 + )

+51

rules/record/block/blocked_a_lot.sml

Import(
    rules=[
        'models/base.sml',
        'models/record/base.sml',
        'models/record/block.sml',
    ],
)

# Counts how many times the *subject* of the block has been blocked in the last day
_Count = IncrementWindow(
    key=f'blk-sbj-{BlockSubjectDid}',
    window_seconds=Day,
    when_all=[True],
)

SomeBlocksRule = Rule(
    when_all=[
        _Count == 20,
    ],
    description='Account was blocked 20 or more times in 24 hours',
)

# Threshold matches the description here and the `mass-blocks` text in config/labels.yaml
MassBlocksRule = Rule(
    when_all=[
        _Count == 100,
    ],
    description='Account was blocked 100 or more times in 24 hours',
)

WhenRules(
    rules_any=[SomeBlocksRule],
    then=[
        AtprotoLabel(
            entity=BlockSubjectDid,
            comment='Account was blocked 20 or more times in 24 hours',
            label='some-blocks',
            expiration_in_hours=3*24,
        ),
    ],
)

WhenRules(
    rules_any=[MassBlocksRule],
    then=[
        AtprotoLabel(
            entity=BlockSubjectDid,
            comment='Account was blocked 100 or more times in 24 hours',
            label='mass-blocks',
            expiration_in_hours=7*24,
        ),
    ],
)

+9

rules/record/block/index.sml

··· 1 + Import( 2 + rules=[ 3 + 'models/base.sml', 4 + 'models/record/base.sml', 5 + 'models/record/block.sml', 6 + ], 7 + ) 8 + 9 + Require(rule='rules/record/block/blocked_a_lot.sml')

+9

rules/record/follow/index.sml

··· 1 + Import( 2 + rules=[ 3 + 'models/base.sml', 4 + 'models/record/base.sml', 5 + 'models/record/follow.sml', 6 + ], 7 + ) 8 + 9 + Require(rule='rules/record/follow/mass_following.sml')

+58

rules/record/follow/mass_following.sml

Import(
    rules=[
        'models/base.sml',
        'models/record/base.sml',
        'models/record/follow.sml',
    ],
)

_MFFollowersDiff = FollowingCount - FollowersCount

# Only count follows from small or very new accounts, or accounts that follow
# far more accounts than follow them back
_MFAgeBased = FollowersCount <= 100 or AccountAgeSecondsUnwrapped <= Day
_MFDiffBased = _MFFollowersDiff >= 5000

MassFollowingCount = IncrementWindow(
    key=f'mass-flw-ct-{UserId}',
    window_seconds=30 * Minute,
    when_all=[
        (_MFAgeBased or _MFDiffBased),
    ],
)

MassFollowingMidRule = Rule(
    when_all=[
        MassFollowingCount == 300,
    ],
    description='Followed 300+ in thirty minutes',
)

MassFollowingHighRule = Rule(
    when_all=[
        MassFollowingCount == 1000,
    ],
    description='Followed 1000+ in thirty minutes',
)

WhenRules(
    rules_any=[MassFollowingMidRule],
    then=[
        AtprotoLabel(
            entity=UserId,
            comment='Followed 300+ in thirty minutes',
            label='mass-follow-mid',
            expiration_in_hours=24,
        ),
    ],
)

WhenRules(
    rules_any=[MassFollowingHighRule],
    then=[
        AtprotoLabel(
            entity=UserId,
            comment='Followed 1000+ in thirty minutes',
            label='mass-follow-high',
            expiration_in_hours=None,
        ),
    ],
)

+51

rules/record/index.sml

··· 1 + Import( 2 + rules=[ 3 + 'models/base.sml', 4 + 'models/record/base.sml', 5 + ], 6 + ) 7 + 8 + Require( 9 + rule='rules/record/post/index.sml', 10 + require_if=(IsCreate or IsUpdate) and Collection == 'app.bsky.feed.post', 11 + ) 12 + 13 + Require( 14 + rule='rules/record/like/index.sml', 15 + require_if=(IsCreate or IsUpdate) and Collection == 'app.bsky.feed.like', 16 + ) 17 + 18 + Require( 19 + rule='rules/record/follow/index.sml', 20 + require_if=(IsCreate or IsUpdate) and Collection == 'app.bsky.graph.follow', 21 + ) 22 + 23 + Require( 24 + rule='rules/record/list/index.sml', 25 + require_if=(IsCreate or IsUpdate) and Collection == 'app.bsky.graph.list', 26 + ) 27 + 28 + Require( 29 + rule='rules/record/listitem/index.sml', 30 + require_if=(IsCreate or IsUpdate) and Collection == 'app.bsky.graph.listitem', 31 + ) 32 + 33 + Require( 34 + rule='rules/record/repost/index.sml', 35 + require_if=(IsCreate or IsUpdate) and Collection == 'app.bsky.feed.repost', 36 + ) 37 + 38 + Require( 39 + rule='rules/record/starterpack/index.sml', 40 + require_if=(IsCreate or IsUpdate) and Collection == 'app.bsky.graph.starterpack', 41 + ) 42 + 43 + Require( 44 + rule='rules/record/block/index.sml', 45 + require_if=(IsCreate or IsUpdate) and Collection == 'app.bsky.graph.block', 46 + ) 47 + 48 + Require( 49 + rule='rules/record/profile/index.sml', 50 + require_if=(IsCreate or IsUpdate) and Collection == 'app.bsky.actor.profile', 51 + )
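Every `Require` in `rules/record/index.sml` shares the same shape: the operation must be a create or update, and the collection NSID selects which sub-index runs. The Python sketch below expresses that as a lookup table. `index_for` is a hypothetical helper for illustration; in SML the same effect is achieved by the `require_if` expressions themselves.

```python
from typing import Optional

# Collection NSID -> index file, copied from rules/record/index.sml.
COLLECTION_INDEXES = {
    "app.bsky.feed.post": "rules/record/post/index.sml",
    "app.bsky.feed.like": "rules/record/like/index.sml",
    "app.bsky.graph.follow": "rules/record/follow/index.sml",
    "app.bsky.graph.list": "rules/record/list/index.sml",
    "app.bsky.graph.listitem": "rules/record/listitem/index.sml",
    "app.bsky.feed.repost": "rules/record/repost/index.sml",
    "app.bsky.graph.starterpack": "rules/record/starterpack/index.sml",
    "app.bsky.graph.block": "rules/record/block/index.sml",
    "app.bsky.actor.profile": "rules/record/profile/index.sml",
}


def index_for(operation_kind: Optional[str], collection: str) -> Optional[str]:
    """Return the index file that would be required, or None if nothing matches."""
    if operation_kind not in ("create", "update"):  # (IsCreate or IsUpdate)
        return None
    return COLLECTION_INDEXES.get(collection)


print(index_for("create", "app.bsky.graph.block"))
print(index_for("delete", "app.bsky.feed.post"))  # None: deletes run no record rules
```

Delete operations and collections outside the table fall through without running any record rules.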

+7

rules/record/like/index.sml

··· 1 + Import( 2 + rules=[ 3 + 'models/base.sml', 4 + 'models/record/base.sml', 5 + 'models/record/like.sml', 6 + ], 7 + )

+7

rules/record/list/index.sml

··· 1 + Import( 2 + rules=[ 3 + 'models/base.sml', 4 + 'models/record/base.sml', 5 + 'models/record/list.sml', 6 + ], 7 + )

+7

rules/record/listitem/index.sml

··· 1 + Import( 2 + rules=[ 3 + 'models/base.sml', 4 + 'models/record/base.sml', 5 + 'models/record/listitem.sml', 6 + ], 7 + )

+31

rules/record/post/inauthentic_fundraising.sml

··· 1 + Import( 2 + rules=[ 3 + 'models/base.sml', 4 + 'models/record/base.sml', 5 + 'models/record/post.sml', 6 + ], 7 + ) 8 + 9 + InauthFundraisingPostRule = Rule( 10 + when_all=[ 11 + AccountAgeSecondsUnwrapped <= 3 * Day, 12 + PostsCount <= 5, 13 + ListContains( 14 + list='fundraise_domains', 15 + phrases=PostAllDomains, 16 + ) != None, 17 + ], 18 + description='Account likely performing inauthentic fundraising', 19 + ) 20 + 21 + WhenRules( 22 + rules_any=[InauthFundraisingPostRule], 23 + then=[ 24 + AtprotoLabel( 25 + entity=UserId, 26 + label='inauth-fundraising', 27 + comment='Account likely performing inauthentic fundraising', 28 + expiration_in_hours=24*7, 29 + ), 30 + ], 31 + )

+25

rules/record/post/index.sml

··· 1 + Import( 2 + rules=[ 3 + 'models/base.sml', 4 + 'models/record/base.sml', 5 + 'models/record/post.sml', 6 + ], 7 + ) 8 + 9 + Require(rule='rules/record/post/post_contains_hello.sml') 10 + Require(rule='rules/record/post/mention_facet_abuse.sml') 11 + Require(rule='rules/record/post/shopping_spam.sml') 12 + Require(rule='rules/record/post/inauthentic_fundraising.sml') 13 + Require(rule='rules/record/post/new_account_slurs.sml') 14 + Require(rule='rules/record/post/negative_posting.sml') 15 + Require(rule='rules/record/post/toxic_posting.sml') 16 + 17 + # Replies Only 18 + Require( 19 + rule='rules/record/post/reply_link_spam.sml', 20 + require_if=PostIsReply and PostExternalLink != None, 21 + ) 22 + Require( 23 + rule='rules/record/post/new_account_replies.sml', 24 + require_if=PostIsReply, 25 + )

+36

rules/record/post/mention_facet_abuse.sml

··· 1 + Import( 2 + rules=[ 3 + 'models/base.sml', 4 + 'models/record/base.sml', 5 + 'models/record/post.sml', 6 + ], 7 + ) 8 + 9 + _IsAbusingFacets = (FacetMentionCount >= 20 and (FollowersCount <= 5 or PostsCount <= 5)) or FacetMentionCount >= 30 10 + _BlackskyFacetAbuse = PdsHost == 'https://blacksky.app' and FacetMentionCount >= 2 and '@' not in PostText 11 + 12 + MentionFacetAbuseRule = Rule( 13 + when_all=[ 14 + _IsAbusingFacets, 15 + ], 16 + description='Account participating in facet mention abuse', 17 + ) 18 + 19 + BlackskyFacetAbuseRule = Rule( 20 + when_all=[ 21 + _BlackskyFacetAbuse, 22 + ], 23 + description='Account participating in facet mention abuse on Blacksky', 24 + ) 25 + 26 + WhenRules( 27 + rules_any=[MentionFacetAbuseRule, BlackskyFacetAbuseRule], 28 + then=[ 29 + AtprotoLabel( 30 + entity=UserId, 31 + label='men-facet-abuse', 32 + comment='Account participating in facet mention abuse', 33 + expiration_in_hours=None, 34 + ), 35 + ], 36 + )
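The `_IsAbusingFacets` expression combines two thresholds: 20 or more mentions from an account with few followers or few posts, or 30 or more mentions from anyone. Restated as a plain Python predicate (the function name and argument names are illustrative only):

```python
def is_abusing_facets(mention_count: int, followers_count: int, posts_count: int) -> bool:
    """Plain-Python restatement of _IsAbusingFacets from mention_facet_abuse.sml."""
    low_reputation = followers_count <= 5 or posts_count <= 5
    return (mention_count >= 20 and low_reputation) or mention_count >= 30


print(is_abusing_facets(20, followers_count=3, posts_count=100))    # True
print(is_abusing_facets(20, followers_count=100, posts_count=100))  # False
print(is_abusing_facets(30, followers_count=100, posts_count=100))  # True
```

The second case shows the effect of the reputation check: an established account needs to reach the higher threshold of 30 mentions before the rule matches.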

+54

rules/record/post/negative_posting.sml

Import(
    rules=[
        'models/base.sml',
        'models/record/base.sml',
        'models/record/post.sml',
    ],
)

_Gate = SentimentScoreUnwrapped <= -0.8

NegativeSentimentCount = IncrementWindow(
    key=f'neg-post-{UserId}',
    window_seconds=4*Hour,
    when_all=[_Gate],
)

NegativePostRule = Rule(
    when_all=[
        # Purposefully lower than the gate
        SentimentScoreUnwrapped <= -0.85,
        PostIsReply,
    ],
    description='This post is negative',
)

NegativePostingRule = Rule(
    when_all=[NegativeSentimentCount >= 3],
    description='User has made three or more negative posts in a four hour window',
)

WhenRules(
    rules_any=[NegativePostRule],
    then=[
        AtprotoLabel(
            entity=AtUri,
            label='negative-post',
            comment='This post is negative',
            expiration_in_hours=None,
            cid=Cid,
        ),
    ],
)

WhenRules(
    rules_any=[NegativePostingRule],
    then=[
        AtprotoLabel(
            entity=UserId,
            label='negative-poster',
            comment='This user made three or more negative posts in four hours',
            # `Day` is a number of seconds, so two days is written in hours here
            expiration_in_hours=2 * 24,
        ),
    ],
)

+39

rules/record/post/new_account_replies.sml

Import(
  rules=[
    'models/base.sml',
    'models/record/base.sml',
    'models/record/post.sml',
  ],
)

_Gate = AccountAgeSecondsUnwrapped <= Hour and not PostIsSelfReply

_ReplyCount = IncrementWindow(
  key=f'new-acc-rep-{UserId}',
  window_seconds=Hour,
  when_all=[_Gate],
)

_TopLevelMinusReplies = PostsCount - _ReplyCount

NewAccountRepliesRule = Rule(
  when_all=[
    _Gate,
    # If the user is mostly just making replies, then we label
    _TopLevelMinusReplies < 2,
    _ReplyCount == 10,
  ],
  description='Account made 10+ replies in their first hour with low top-level post count',
)

WhenRules(
  rules_any=[NewAccountRepliesRule],
  then=[
    AtprotoLabel(
      entity=UserId,
      label='new-acct-replies',
      comment='Account made 10+ replies in their first hour with low top-level post count',
      expiration_in_hours=7*24,
    ),
  ],
)
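The `_ReplyCount == 10` comparison in this rule (and the similar `== 20` and `== 15` checks in other rules) appears intended to fire once, on the event that brings the counter to the threshold, rather than on every later event. The Python sketch below illustrates that pattern. It assumes `IncrementWindow` returns the post-increment count of matching events inside a trailing window; that is an assumption, and `WindowCounter` and `fire_times` are hypothetical names, not part of Osprey or its fork.

```python
from collections import defaultdict, deque


class WindowCounter:
    """Hypothetical in-memory stand-in for a windowed counter (not the Osprey API)."""

    def __init__(self, window_seconds):
        self.window_seconds = window_seconds
        self._events = defaultdict(deque)  # key -> timestamps inside the window

    def increment(self, key, now):
        """Record one event at time `now` and return the count within the window."""
        events = self._events[key]
        events.append(now)
        # Drop timestamps that have fallen out of the trailing window.
        while events and events[0] <= now - self.window_seconds:
            events.popleft()
        return len(events)


def fire_times(timestamps, threshold, window_seconds):
    """Return the timestamps at which `count == threshold` holds."""
    counter = WindowCounter(window_seconds)
    return [t for t in timestamps if counter.increment("user", t) == threshold]


# Twelve replies, one per minute, all inside a one-hour window.
replies = [60 * i for i in range(12)]
fired = fire_times(replies, threshold=10, window_seconds=3600)
print(fired)
```

Under these assumptions the equality check matches only the tenth reply, whereas a `>=` check would match the tenth, eleventh, and twelfth. In a sliding window the count can drop and reach the threshold again later, so "once" holds per burst rather than per account.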

+62

rules/record/post/new_account_slurs.sml

Import(
  rules=[
    'models/base.sml',
    'models/record/base.sml',
    'models/record/post.sml',
  ],
)

_InitialGate = AccountAgeSecondsUnwrapped <= 7 * Day or PostsCount <= 25 or FollowersCount <= 5

_ContainsSlurHigh = CensorizedListMatch(
  list='slurs',
  plurals=True,
  phrases=PostTextCleanedTokens,
) != None

_ContainsSlurLow = CensorizedListMatch(
  list='slurs_low',
  plurals=True,
  phrases=PostTextCleanedTokens,
) != None

_HighSlursCount = IncrementWindow(
  key=f'slur-high-{UserId}',
  window_seconds=Hour,
  when_all=[
    _InitialGate,
    _ContainsSlurHigh,
  ],
)

_LowSlursCount = IncrementWindow(
  key=f'slur-low-{UserId}',
  window_seconds=Hour,
  when_all=[
    _InitialGate,
    _ContainsSlurLow,
  ],
)

_IsVeryNewAccount = (AccountAgeSecondsUnwrapped <= 1 * Day or PostsCount <= 10 or FollowersCount <= 5)

_LabelGateVNA = _IsVeryNewAccount and (_LowSlursCount == 2 or _HighSlursCount == 1)
_LabelGateOther = _LowSlursCount == 4 or _HighSlursCount == 1
_LabelGate = _LabelGateVNA or _LabelGateOther

NewAccountSlursRule = Rule(
  when_all=[_InitialGate, _LabelGate],
  description='New account found to be using slurs.',
)

WhenRules(
  rules_any=[NewAccountSlursRule],
  then=[
    AtprotoLabel(
      entity=UserId,
      label='new-acct-slurs',
      comment='New account found to be using slurs',
      expiration_in_hours=7 * 24,
    ),
  ],
)
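The two-tier label gate in this file can be read as a small truth table: a high-severity match labels on the first hit regardless of account age, while low-severity matches need two hits for a very new account and four otherwise. The Python sketch below restates only that boolean logic; `label_gate` and its parameters are hypothetical names standing in for `_LabelGate`, `_IsVeryNewAccount`, `_LowSlursCount`, and `_HighSlursCount`, and the sketch leaves out `_InitialGate`.

```python
def label_gate(is_very_new, low_count, high_count):
    """Mirror of the _LabelGate expression, using plain Python values."""
    # Very new accounts: second low-severity hit, or first high-severity hit.
    gate_very_new = is_very_new and (low_count == 2 or high_count == 1)
    # Everyone else past the initial gate: fourth low-severity hit, or first high-severity hit.
    gate_other = low_count == 4 or high_count == 1
    return gate_very_new or gate_other


cases = [
    (True, 2, 0),   # very new, second low-severity hit
    (False, 2, 0),  # not very new, second low-severity hit
    (False, 4, 0),  # not very new, fourth low-severity hit
    (False, 0, 1),  # first high-severity hit
    (True, 3, 0),   # very new, third low-severity hit
]
results = [label_gate(*case) for case in cases]
print(results)
```

Because the checks use equality, a very new account that has already been labeled at two low-severity hits does not match again at three, but would match again at four through the second branch.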

+14

rules/record/post/post_contains_hello.sml

Import(
  rules=[
    'models/base.sml',
    'models/record/base.sml',
    'models/record/post.sml',
  ],
)

PostContainsHelloRule = Rule(
  when_all=[
    'hello' in StringToLower(s=PostText),
  ],
  description='Post text contains hello',
)

+35

rules/record/post/reply_link_spam.sml

Import(
  rules=[
    'models/base.sml',
    'models/record/base.sml',
    'models/record/post.sml',
  ],
)

_Gate = PostIsReply and PostExternalLink != None

_ReplyLinkCount = IncrementWindow(
  key=f'reply-link-{UserId}',
  window_seconds=Day,
  when_all=[_Gate],
)

ReplyLinkSpamRule = Rule(
  when_all=[
    _Gate,
    _ReplyLinkCount == 20,
  ],
  description='Account has replied with a link 20+ times in 24 hours',
)

WhenRules(
  rules_any=[ReplyLinkSpamRule],
  then=[
    AtprotoLabel(
      entity=UserId,
      label='reply-link-spam',
      comment='Account has replied with a link 20+ times in 24 hours',
      expiration_in_hours=24*7,
    ),
  ],
)

+39

rules/record/post/shopping_spam.sml

Import(
  rules=[
    'models/base.sml',
    'models/record/base.sml',
    'models/record/post.sml',
  ],
)

_HasShoppingDomain = ListContains(
  list='shopping',
  phrases=PostAllDomains,
)

ShoppingDomainCount = IncrementWindow(
  key=f'mass-flw-ct-{UserId}',
  window_seconds=30 * Minute,
  when_all=[
    _HasShoppingDomain != None,
  ],
)

ShoppingSpamRule = Rule(
  when_all=[
    ShoppingDomainCount == 15,
  ],
  description='Account posted a shopping link 15+ times in 30 minutes',
)

WhenRules(
  rules_any=[ShoppingSpamRule],
  then=[
    AtprotoLabel(
      entity=UserId,
      label='shopping-spam',
      comment='Account posted a shopping link 15+ times in 30 minutes',
      expiration_in_hours=None,
    ),
  ],
)

+53

rules/record/post/toxic_posting.sml

Import(
  rules=[
    'models/base.sml',
    'models/record/base.sml',
    'models/record/post.sml',
  ],
)

_Gate = ToxicityScoreUnwrapped <= -0.997

ToxicPostCount = IncrementWindow(
  key=f'tox-post-{UserId}',
  window_seconds=4*Hour,
  when_all=[_Gate],
)

ToxicPostRule = Rule(
  when_all=[
    _Gate,
    PostIsReply,
  ],
  description='This post is toxic',
)

ToxicPostingRule = Rule(
  when_all=[ToxicPostCount >= 3],
  description='User has made three or more toxic posts in a four hour window',
)

WhenRules(
  rules_any=[ToxicPostRule],
  then=[
    AtprotoLabel(
      entity=AtUri,
      label='toxic-post',
      comment='This post is toxic',
      expiration_in_hours=None,
      cid=Cid,
    ),
  ],
)

WhenRules(
  rules_any=[ToxicPostingRule],
  then=[
    AtprotoLabel(
      entity=UserId,
      label='toxic-poster',
      comment='This user made three or more toxic posts in four hours',
      expiration_in_hours=2 * 24,
    ),
  ],
)

+28

rules/record/profile/hailey_profile.sml

Import(
  rules=[
    'models/base.sml',
    'models/record/base.sml',
    'models/record/profile.sml',
  ],
)

HaileyProfileRule = Rule(
  when_all=[
    UserId == 'did:plc:oisofpd7lj26yvgiivf3lxsi',
  ],
  description='Hailey updated her profile',
)

WhenRules(
  rules_any=[
    HaileyProfileRule,
  ],
  then=[
    AtprotoLabel(
      entity=UserId,
      label='hailey',
      comment='Hailey updated her profile',
      expiration_in_hours=None,
    ),
  ]
)

+11

rules/record/profile/index.sml

Import(
  rules=[
    'models/base.sml',
    'models/record/base.sml',
    'models/record/profile.sml',
  ],
)

Require(
  rule='rules/record/profile/hailey_profile.sml',
)

+7

rules/record/repost/index.sml

Import(
  rules=[
    'models/base.sml',
    'models/record/base.sml',
    'models/record/repost.sml',
  ],
)

+12

rules/record/starterpack/index.sml

Import(
  rules=[
    'models/base.sml',
    'models/record/base.sml',
    'models/record/starterpack.sml',
  ],
)

Require(
  rule='rules/record/starterpack/starter_pack_creations.sml',
  require_if=IsCreate,
)

+35

rules/record/starterpack/starter_pack_creations.sml

Import(
  rules=[
    'models/base.sml',
    'models/record/base.sml',
    'models/record/starterpack.sml',
  ],
)

_Gate = IsCreate

_CreationsCount = IncrementWindow(
  key=f'stpk-create={UserId}',
  window_seconds=7 * Day,
  when_all=[_Gate],
)

MultipleStarterPackCreations = Rule(
  when_all=[
    _Gate,
    _CreationsCount > 2,
  ],
  description='Account made more than two starter packs in a week',
)

WhenRules(
  rules_any=[MultipleStarterPackCreations],
  then=[
    AtprotoLabel(
      entity=UserId,
      label='stpk-creations',
      comment='Account made more than two starter packs in a week',
      expiration_in_hours=7 * 24,
    ),
  ],
)