Filter anything with RegexieBot

An advanced text filter for Telegram groups. Never speaks unless spoken to, with no in-group ads and no premium features.

Introduction

RegexieBot is a regex-based filter for Telegram groups. It filters not only messages but also message entities and user names. It is mainly used to:

  • detect particular words, domains, usernames, writing systems, emojis, or any sequence of Unicode characters
  • reply to commands or questions
  • enforce a particular pattern in messages, e.g. requiring certain hashtags at the beginning of a message or allowing no more than two emojis
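As a rough illustration of the tasks listed above, here are three JS regular expressions. They are sketches only: the constant names are hypothetical, and whether RegexieBot accepts the `u` flag and Unicode property escapes is an assumption, not something stated here.

```javascript
// Illustrative patterns only; the names are hypothetical, not RegexieBot settings.

// Detect a writing system: any Cyrillic character.
const hasCyrillic = /\p{Script=Cyrillic}/u;

// Enforce a hashtag at the very beginning of a message.
const startsWithHashtag = /^#[\p{L}\p{N}_]+/u;

// Flag three or more emoji code points, i.e. "more than two emojis".
// Rough approximation: emoji built from several code points count more than once.
const moreThanTwoEmojis = /(?:\p{Extended_Pictographic}[\s\S]*?){3}/u;

console.log(hasCyrillic.test("Привет"));            // true
console.log(startsWithHashtag.test("hello #news")); // false
console.log(moreThanTwoEmojis.test("ok 😀😀"));      // false
```

The first pattern detects content, while the other two check the shape of a message: a rule could act when `startsWithHashtag` fails or when `moreThanTwoEmojis` matches.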

You don’t have to worry about premium subscriptions, referral programs, in-group ads, and other unpleasant things that everyone faces nowadays.

@RegexieBot is the only instance at the moment; everything else is not us.

Energy ⚡️

Bots, like any other programs, use real energy resources. RegexieBot calculates its virtual energy usage (hereafter, energy or ⚡️) using the formula:

energy_used = number_of_messages * (10 + message_processing_price)

The message_processing_price is the sum of the prices of active rules, which you can see at the top of the rules and filters. It’s calculated using a special algorithm when you create a rule.

The constant 10 is passive consumption: every message in the group subtracts 10⚡️ from your balance once you’ve connected the bot. Telegram doesn’t allow bots to block messages, so the bot has to download every message even when its rules are inactive. You can drain all your energy without actually using the bot, so disconnect it from big groups when it’s no longer needed.
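The formula can be sketched as two small functions. This is a minimal illustration, not the bot's actual code: the function names and the rule prices (12 and 9) are made up for the example.

```javascript
// Passive cost charged for every message, taken from the formula above.
const PASSIVE_COST = 10;

// rulePrices: prices of the currently active rules (hypothetical values).
function messagePrice(rulePrices) {
  const processingPrice = rulePrices.reduce((sum, price) => sum + price, 0);
  return PASSIVE_COST + processingPrice;
}

// energy_used = number_of_messages * (10 + message_processing_price)
function energyUsed(numberOfMessages, rulePrices) {
  return numberOfMessages * messagePrice(rulePrices);
}

console.log(messagePrice([12, 9]));     // 31
console.log(energyUsed(1000, [12, 9])); // 31000
console.log(energyUsed(1000, []));      // 10000
```

The last call shows the passive consumption: with no active rules, 1000 messages still cost 10000⚡️.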

RegexieBot gives each user 1M⚡️ at the beginning of every month. You can also buy additional energy, which is used once the free reserve runs out. When your reserve runs out, your groups are deactivated.

Let’s practice.

Imagine you have two groups: Group A, which uses 31⚡️ per message, and Group B, which uses 19⚡️. You need to decide whether to buy additional energy and, if so, how much. Group A gets approximately 10K messages per month, and Group B gets 25K. The calculation 10K×31+25K×19≈0.8M⚡️ suggests that the free monthly energy should be enough. However, if Group A alone exceeded 1M/31≈32K messages, the free reserve would run out and your groups would be deactivated, even with no traffic in Group B.

Now let’s say both of your groups have grown by 10K messages per month. The calculation 20K×31+35K×19≈1.3M⚡️ shows that you exceed the free monthly limit by about 300K⚡️. You can buy 1M⚡️, which should cover the shortfall for about three months.
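The same arithmetic can be checked with a short script. It is a sketch using the numbers from the two scenarios above; the function and field names are hypothetical.

```javascript
// Free energy granted at the beginning of every month.
const FREE_MONTHLY_ENERGY = 1000000;

// groups: [{ messages, pricePerMessage }]; pricePerMessage includes the passive 10.
function monthlyUsage(groups) {
  return groups.reduce((sum, g) => sum + g.messages * g.pricePerMessage, 0);
}

// Energy that must come from purchased reserves, or 0 if the free amount suffices.
function shortfall(groups) {
  return Math.max(0, monthlyUsage(groups) - FREE_MONTHLY_ENERGY);
}

const current = [
  { messages: 10000, pricePerMessage: 31 }, // Group A
  { messages: 25000, pricePerMessage: 19 }, // Group B
];
const grown = [
  { messages: 20000, pricePerMessage: 31 }, // Group A, +10K messages
  { messages: 35000, pricePerMessage: 19 }, // Group B, +10K messages
];

console.log(monthlyUsage(current)); // 785000
console.log(shortfall(current));    // 0
console.log(monthlyUsage(grown));   // 1285000
console.log(shortfall(grown));      // 285000
```

At a shortfall of 285K⚡️ per month, an extra 1M⚡️ would last roughly 1M/285K≈3.5 months.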

Rules and regular expressions

The bot can filter not only messages, but also names and some text entities. Thus, there are multiple text filters with their own built-in rules and special options. A rule is just a pattern + action.

There are two types of patterns: simple and regex. A simple pattern is a list of strings plus basic parameters such as case sensitivity. If your task is more complex, you can use good old regular expressions compatible with JS. regex101.com and various LLMs can help you create and test regular expressions.
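To make the difference concrete, here is a sketch of how a simple pattern could be matched and how the same list could be written as a JS regex. The object shape (`strings`, `caseSensitive`) and the function names are assumptions for illustration; the bot's real parameters and matching logic may differ.

```javascript
// Hypothetical shape of a simple pattern: strings plus a case-sensitivity switch.
function matchesSimple(pattern, text) {
  const haystack = pattern.caseSensitive ? text : text.toLowerCase();
  return pattern.strings.some((s) =>
    haystack.includes(pattern.caseSensitive ? s : s.toLowerCase())
  );
}

// Escape regex metacharacters so a plain string can be embedded in a RegExp.
function escapeRegex(s) {
  return s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Build a regex that matches any of the same strings.
function toRegex(pattern) {
  const body = pattern.strings.map(escapeRegex).join("|");
  return new RegExp(body, pattern.caseSensitive ? "" : "i");
}

const spam = { strings: ["free money", "t.me/joinchat"], caseSensitive: false };

console.log(matchesSimple(spam, "Get FREE MONEY now"));      // true
console.log(toRegex(spam).test("visit T.ME/JOINCHAT/abc")); // true
console.log(toRegex(spam).test("t-me/joinchat"));           // false
```

A regex only becomes necessary once the task needs something a list of strings cannot express, such as word boundaries, anchors, or repetition.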

The consumption of each pattern is calculated using a special algorithm based on benchmarking. The algorithm is far from perfect and may show different results for the same rule, but we’re working on improving it. Experiment to find the most efficient pattern.

Entity filters were introduced to simplify pattern creation and lower consumption for the most common text entities. For example, if you want to filter google.com, you don’t need to invent something like /\b(https?:\/\/)?(.+\.)?google\.com(\/.+)?\b/i and spend a lot of energy on it. Telegram has already parsed the link, so just add /\bgoogle\.com(\/|$)/i to the special links filter. This regex isn’t perfect, but it is simple and energy-efficient.
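The behaviour of the short links-filter regex can be checked against a few sample links. In this sketch the dots are escaped so that they match only a literal dot, and the sample links are hypothetical; the exact form in which the bot receives parsed links is an assumption.

```javascript
// The short regex intended for the links filter.
const linkFilterRegex = /\bgoogle\.com(\/|$)/i;

// Hypothetical link values, as if Telegram had already extracted them.
const links = [
  "google.com",                    // plain domain
  "https://mail.google.com/inbox", // subdomain with a path
  "notgoogle.com",                 // different domain ending in the same letters
  "google.com.example.org",        // look-alike domain
  "example.org/?q=google.com",     // domain appears only in the query string
];

for (const link of links) {
  console.log(link, linkFilterRegex.test(link));
}
// google.com true
// https://mail.google.com/inbox true
// notgoogle.com false
// google.com.example.org false
// example.org/?q=google.com true
```

The last line is one example of why the regex isn't perfect: a link that merely ends with the domain in its query string also matches.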